Our primary objective is to consider a special class of first-order
autoregressive multivariate time series models in which the individual
series correspond to locations on a plane. Conditioned on the past, the
expected response at a given location for a given time period is taken
to be a linear combination of the immediate past response at that
location and a weighted average of the immediate past responses at the
other locations. If the weights are not assumed to be known, an
exponential weight function of the interlocational distances is used.
(We refer to this as the variable weights case.) The form of the
weighting function is quite flexible in that it allows for a wide range
of weighting schemes which might be appropriate in various applications
to both regular and irregular arrays of locations. Parameters of
interest are the two linear coefficients and a parameter in the weight
function (in the variable weights case).

An estimation procedure is proposed which takes into account the
spatial nature of the process through modification of the usual
Yule-Walker estimators. Using the results for the usual Yule-Walker
estimators, ours are shown to be consistent (in probability) and
asymptotically normally distributed for both the known and variable
weights cases.

A  benefit  of  our  approach  to  the  spatial  time  series  problem  is 
that  we  obtain  straightforward  asymptotic  tests  for  location,  neighbor, 
and  distance  effects.   Asymptotic  joint  confidence  ellipsoids  are  also 
given  for  these  parameters.   We  develop  an  approximation  to  the  variance- 
covariance  matrix  of  the  k-step  prediction  errors  in  using  the  fitted 
general  first-order  autoregressive  model.   The  necessary  modifications 
of  this  matrix  for  the  spatial  model  are  given. 

We  present  consistent  estimators  of  the  variance-covariance 
matrices  of  the  error  term  and  the  time  series.   This  allows  us  to 
consistently estimate all other variance-covariance matrices
encountered in our work.

Some  simulation  results  are  presented  which  indicate  that  the 
performance  of  our  estimators  depends  on  the  location,  neighbor,  and 
distance  effects  as  well  as  array  characteristics.   There  does  not 
appear  to  be  one  model  specification  for  which  all  estimators  perform 
well  except  for  large  (by  time  series  standards)  samples.   An  actual 
data  example  is  also  analyzed. 

The  methodology  developed  is  flexible  so  that  it  can  have  a  wide 
range  of  application.   The  procedures  presented  suggest  the  possibility 
for  extension  of  these  results  to  other  first-order  autoregressive  models, 
both  spatial  and  nonspatial,  for  which  restrictions  are  placed  on  the 
coefficient matrix.

CHAPTER I


INTRODUCTION 


1.0  Preamble 


The  spatial  problem  being  investigated  is  introduced  in  Section  1.1 
by  considering  several  examples  which  serve  as  motivation  for  our  work. 
After  a  review  of  the  literature  in  Section  1.2,  we  describe  our  approach 
to  this  problem  in  Section  1.3.   An  outline  of  the  results  to  be  presented 
is  given  in  Section  1.4.   Section  1.5  introduces  our  notation  and  format 
for  the  dissertation. 

1.1  Introduction  to  the  Spatial  Problem 
Many  physical  processes  generate  multivariate  responses  for  which 
the  components  of  the  vector  response  are  associated  with  distinct  points 
in  a  plane.   These  responses  may  be  repeated  over  time.   Such  processes 
are referred to as spatial-temporal processes. For example, several
weather stations might be located throughout a region, with each station
monitoring local conditions on a regular basis. Suppose temperature
readings are recorded every hour at each station. We can regard the
vector responses of hourly temperatures as a multivariate time series.
In addition to expecting a relationship among the vector responses over
time, we might expect a spatial relationship among the components of the
vector since the individual variates correspond to particular locations
in a region. In particular, we might expect that there is a "distance
effect" among the locations, with the responses from those stations that
are  close  together  being  perhaps  more  strongly  related  than  responses 
from  stations  that  are  far  apart.   We  refer  to  multivariate  time  series 
of  this  type  as  spatial  time  series. 

For our real data example in Chapter VI, we consider unemployment
rates for ten centers in southwestern England. Each month, the
unemployment rate is determined for the region corresponding to each center.
These  monthly  rates  for  all  ten  centers  constitute  a  spatial  time  series. 

In  modeling  a  spatial  time  series,  our  objective  in  this  paper  is 
to model the nondeterministic component of the series. (We expect to
consider the deterministic component as well in future research.) With this
objective in mind, we consider the simplest autoregressive model, the
first-order model for which

$$ \underline{y}_t = B\, \underline{y}_{t-1} + \underline{\epsilon}_t, \tag{1.1.1} $$

where $\underline{y}_t$ is the vector response at time t, $\underline{\epsilon}_t$ is an
unobservable random error vector, and B is an n x n matrix of
coefficients. In the general model, it is not assumed that B has a
specific structure which would reflect the spatial nature of the series.
Consequently, in applying only the general estimation schemes for B to a
spatial problem, we are not explicitly accounting for the spatial
aspects of the phenomenon under study.
It would seem desirable to assume a structure for B that reflects
the spatial nature of the process. In particular, in considering a
response at a given location at time t, it would be of interest to
consider the relationship of that response to a response at the same
location and to responses at neighboring locations in the previous time
period. Factors such as distance should enter into the consideration of
the relationship with a particular neighbor.


By  assuming  such  a  structure  for  B  and  developing  estimation 
procedures  based  on  this  structure,  we  hope  to  model  the  underlying 
process which generated the series. In addition, the structural
assumptions would probably mean a reduction in the number of parameters in the
model.   A  parsimonious  parameterization  is  desirable,  provided  that  such 
a  model  adequately  describes  the  process,  since  such  a  parameterization 
allows  more  efficient  usage  of  the  sample  information.   A  model  which 
incorporates  both  the  spatial  and  time  aspects  of  the  process  would  seem 
to  be  a  better  forecasting  tool  than  a  model  which  only  includes  the  time 
aspect. 

Before  going  into  more  detail  on  our  approach  to  this  problem, 
we  review  the  literature  on  related  problems. 

1.2  A  Literature  Review 

Much of the work in the general area of spatially related random
variables has been done with purely spatial processes, where both joint
and conditional models have been considered. For the joint model, the
response at location i is related to the responses at the other
locations, simultaneously. Specializing a joint model to the linear
case, we have that

$$ y_i = \sum_{j \ne i} \beta_{ij}\, y_j + \epsilon_i, \tag{1.2.1} $$

where $\epsilon_i$ is a random error term. For the linear conditional
model, the relationship is such that

$$ E(y_i \mid \text{responses at other locations}) = \sum_{j \ne i} \gamma_{ij}\, y_j. \tag{1.2.2} $$

At first glance, it would appear that taking the expectation of $y_i$ in
(1.2.1) conditional on the responses at the other locations would yield
(1.2.2) with $\gamma_{ij} = \beta_{ij}$ for all i and $j \ne i$. However,
this is not the case since the error term, $\epsilon_i$, is not
independent of the $y_j$'s. Bartlett (1974), Besag (1974), Brook (1964),
Cliff and Ord (1975), and Ord (1975) give more complete discussions of
the differences between the two specifications.

Many of the specific results for spatial processes are for
regular arrays (for example, rectangular) of locations. Restrictions are
usually placed on the coefficients in (1.2.1) and (1.2.2). For example,
a simple first-order joint model on a regular lattice is given by

$$ y_{ij} = \beta\,(y_{i-1,j} + y_{i,j-1} + y_{i+1,j} + y_{i,j+1}) + \epsilon_{ij}, $$

where the subscripts correspond to the coordinates of the location.

The correlation structure (or spectral function) of some of the
joint models (or their continuous analogues) is considered by Bartlett
(1974), Besag (1972), Heine (1955), and Whittle (1954). Whittle (1954)
developed  a  maximum  likelihood  estimation  scheme  for  the  parameters  of 
the  spectral  function. 

Besag  (1974),  a  major  proponent  of  a  conditional  approach, 
discusses  a  class  of  conditional  models  called  auto-models.   Examples  are 
the  auto-normal,  auto-binomial,  and  auto-logistic.   These  models  are 
specified  by  the  probability  (or  density)  function  of  y.  conditional  on 
the response at all other locations. Although these models can be
specified for both regular and irregular arrays of locations, the
statistical analysis is generally limited to the regular lattice cases.
Besag (1974) shows that it can be quite difficult to use maximum
likelihood procedures directly and thus discusses two alternative
approaches. The first relies on a subsetting of the responses (which is
called coding) which results in a simpler likelihood. In the second,
another simpler maximum likelihood procedure results when a unilateral
approximation to the original process is used. (For the unilateral
approach, the concept of one-
directional  dependency  in  an  autoregressive  time  series  is  extended  to 
two  dimensions.)   Besag  and  Moran  (1975)  use  the  coding  procedure  to 
develop  a  test  of  spatial  dependency  for  an  auto-normal  process. 

Although  irregular  arrays  may  be  less  attractive  mathematically, 
they  are  of  interest  for  practical  reasons  since  many  spatial  processes 
occur  naturally  on  irregular  arrays.   Cliff  and  Ord  have  done  extensive 
work  in  this  area.   Their  approach  has  been  to  specify  weights  that  are 
functions  of  array  characteristics  such  as  interlocational  distances  and 
region  size.   (See  Cliff  and  Ord  (1969,  1975),  Cliff,  Haggett  et  al. 
(1975:148-149,  161),  or  Mead  (1971)  for  examples.)   For  example,  a  joint 
model  could  be  specified  such  that 

$$ y_i = \rho \sum_{j=1}^{n} w_{ij}\, y_j + \epsilon_i, \tag{1.2.3} $$

where the $w_{ij}$'s are known weights and $\epsilon_i$ is a random error
term. (The approach also extends to the conditional case.) A natural
extension would be for a restricted parameterization of the weights so
that sample information could be used to estimate them.

Two types of inference problems are considered for Cliff and
Ord-type models. The first involves tests for spatial autocorrelation
and the second involves parameter estimation. Cliff, Haggett et al.
(1975:152-155) present a parametric test (under normal assumptions)
and a nonparametric test of $H_0: \rho = 0$, where $\rho$ is as in (1.2.3)
or its conditional analogue. Both test statistics, under the null
hypothesis, have asymptotic normal distributions (as $n \to \infty$).
Cliff and Ord (1972) develop a similar test for spatial correlation
among the error residuals in a linear regression.

Maximum likelihood estimation procedures (under normal
assumptions) are presented by Ord (1975) for both the model in (1.2.3) and an
extension which includes regressor variables. Maximum likelihood
procedures for some other models are outlined in Cliff and Ord (1975).

Another approach to modeling spatial processes has been to think
of the responses as a surface and fit polynomial models of the form,

$$ y = \sum_{i=0}^{m} \sum_{j=0}^{r} \beta_{ij}\, x_1^{i}\, x_2^{j} + \epsilon, $$

where $x_1$ and $x_2$ are the map coordinates and $\epsilon$ is a random
error term. (See Cliff, Haggett et al. (1975:49-70).) This is an example
of a trend surface model.

A  somewhat  different  class  of  spatial  processes  is  the  class  of 
spatial point processes. These processes are characterized by the
distribution of points across a region. The literature is fairly extensive
in this area. Two important types of analysis of point processes are the
distance methods and the quadrat count methods. A sampling of results
for  these  and  related  methods  can  be  found  in  the  work  of  Diggle  (1975), 
Holgate  (1972),  Mead  (1974),  Rogers  (1974),  and  Strauss  (1975). 

Spatial-temporal processes are an extension of purely spatial
processes.   Both  Granger  (1969)  and  Cliff,  Haggett  et  al.  (1975:107-141) 
have  used  standard  multivariate  time  series  techniques  (cross-spectral 
analysis)  in  comparing  time  series  corresponding  to  locations  in  a  region. 

Cross-sectional  time  series  analysis  may  be  appropriate  for  some 
spatial  problems  where  the  cross  sections  are  taken  over  regions  or 


locations.   Swamy  and  Mehta  (1977)  consider  a  linear  model  for  cross- 
sectional  time  series  in  which  the  coefficient  vector  is  taken  to  be  the 
sum  of  a  mean  vector  and  two  random  components.   One  component  varies 
over  time  and  among  individuals  (which  could  be  locations)  and  the  other 
varies  only  over  individuals. 

Fuller  and  Battese  (1974)  consider  estimation  of  a  linear  model 
for  cross-sectional  time  series  but  assume  an  error  term  which  is  the 
sum  of  location  and  individual  components  (possibly  random)  and  another 
random  component.   Both  Maddala  (1971)  and  Nerlove  (1971)  have  studied 
estimation  for  error-component  linear  models  (somewhat  similar  to  Fuller 
and Battese's model) which contain a single lagged value of the
dependent (univariate) variable.

Cliff  and  Ord  (1970)  discuss  estimation  schemes  and  testing 
procedures  for  the  coefficient  vectors  of  a  linear  model  for  cross- 
sectional  time  series.   Constraints  on  the  coefficient  vectors  such  as 
equality  for  all  individuals  (or  over  time)  are  considered.   They  also 
develop  some  estimation  procedures  when  the  coefficient  vector  is  random. 
Although  we  found  a  number  of  related  problems  in  our  literature 
review,  we  found  little  evidence  of  statistical  procedures  developed  for 
a  spatially  restricted  coefficient  matrix  for  the  model  in  (1.1.1). 
This  research  develops  such  procedures. 

1.3  Our  Approach  to  the  Problem 

In Section 1.1, we suggested that a first-order spatial time
series model should incorporate location, neighbor, and distance effects
in the structure of B. We will do this by considering the response
for time t at location i, $y_{t,i}$, to be of the form

$$ y_{t,i} = a\, y_{t-1,i} + b \sum_{j=1}^{n} w_{ij}\, y_{t-1,j} + \epsilon_{t,i}, \tag{1.3.1} $$

where $\epsilon_{t,i}$ is a random error term, a and b are parameters
whose values are unknown, and n is the number of locations in the array.
The $w_{ij}$'s are weights which may be completely known or contain one
or more parameters to be estimated from the sample information. We make
three assumptions concerning the weights.

A1: For all $w_{ij}$, $0 \le w_{ij} \le 1$.

A2: For all i, $w_{ii} = 0$.

A3: The weights are scaled to add to unity for each location.
    That is, $\sum_{j=1}^{n} w_{ij} = 1$ for all i.

Since $y_{t-1,i}$ already enters the model with a as its coefficient,
we set $w_{ii} = 0$ for all i. The other two assumptions are made to
provide a consistent class of models. (For example, the total weight
should not depend on the number of locations in the array.) The
necessity of these assumptions will be seen as they are used in the
derivation of certain results in later chapters.
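
As a minimal sketch of how raw, nonnegative weights could be brought into the form required by A1 through A3, the following Python function zeroes the diagonal and rescales each row to sum to one. The function name `scale_weights` and the example matrix are illustrative assumptions and do not come from the original work.

```python
def scale_weights(raw):
    """Return a scaled copy of a square weight matrix.

    The diagonal is set to zero (A2) and each row is divided by its
    sum so that it adds to one (A3); with nonnegative input every
    scaled weight then lies between 0 and 1 (A1).
    """
    n = len(raw)
    scaled = []
    for i in range(n):
        # Drop the weight a location would place on itself.
        row = [0.0 if j == i else float(raw[i][j]) for j in range(n)]
        if any(v < 0 for v in row):
            raise ValueError("raw weights must be nonnegative")
        total = sum(row)
        if total <= 0:
            raise ValueError("each location needs a positive off-diagonal weight")
        scaled.append([v / total for v in row])
    return scaled


# Hypothetical raw weights for an array of three locations.
raw = [[0, 2, 2],
       [1, 0, 3],
       [4, 4, 0]]
W = scale_weights(raw)
```

Because each row is rescaled separately, the scaled matrix is generally not symmetric even when the raw weights are.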

By considering all three assumptions, we see that
$\sum_{j=1}^{n} w_{ij}\, y_{t-1,j}$ is just a weighted average of the
responses at time (t-1) for all locations other than i. It follows that
the parameters, a and b, can be regarded as accounting for a location
effect and a neighbor effect, respectively. If a is zero, only the
neighboring locations of i are explicitly related to $y_{t,i}$. However,
if b = 0, none of i's neighbors appears explicitly in the model for
$y_{t,i}$. (By a neighbor of location i, we mean any location other than
i and not just contiguous neighbors.)


The nature of the distance effect among the neighbors would determine the
form of the weights. If a distance effect is to be considered, there
must  be  at  least  two  different  interlocational  distances,  and  thus,  the 
need  for  an  additional  assumption. 

A4:   There  are  at  least  three  locations  in  the  array.   If  there 
are  exactly  three,  the  array  is  not  in  the  form  of  an 
equilateral  triangle. 

The  model  in  (1.3.1)  is  a  specific  case  of  a  more  general  model 
suggested  in  Cliff,  Haggett  et  al.  (1975:202).   By  referring  to  the  model 
in  (1.3.1)  as  "our  model,"  we  do  not  intend  to  suggest  originality  on  our 
part  in  the  model  formulation,  but  we  do  develop  original  methods  of 
parameter  estimation,  particularly  in  the  variable  weights  case.  We  also 
refer  to  this  model  as  "the  spatial  model." 

Writing the model in (1.3.1) in matrix form yields

$$ \underline{y}_t = (a\, I_n + b\, W)\, \underline{y}_{t-1} + \underline{\epsilon}_t, $$

where W is the matrix of weights (all diagonal terms are zero) and $I_n$
is the n x n identity matrix. We summarize the restrictions on B as
follows.

A5: For a first-order autoregressive spatial time series, the
    model in (1.1.1) is such that $B = B_r$, where

    $$ B_{r,ii} = a \quad \text{for all } i $$

    and

    $$ B_{r,ij} = b\, w_{ij} \quad \text{for all } i \text{ and } j \ne i. $$

With this model specification, one objective is to estimate a, b,
any parameters in the weight function, and the variance-covariance matrix
of the error terms. Another objective is to make the modifications
necessary to use this model in forecasting.
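
To make the restricted coefficient matrix concrete, the sketch below builds a matrix with a on the diagonal and b times the weights off the diagonal, then generates a short series from the first-order model. The function names, the zero starting vector, and the independent normal errors with a common standard deviation are all assumptions made for illustration; the model itself does not require normal errors.

```python
import random


def spatial_coefficient_matrix(a, b, W):
    """Return a*I + b*W as a list of rows (W has a zero diagonal)."""
    n = len(W)
    return [[(a if i == j else 0.0) + b * W[i][j] for j in range(n)]
            for i in range(n)]


def simulate(a, b, W, T, sigma=1.0, seed=0):
    """Generate T vector responses from y_t = (a*I + b*W) y_{t-1} + e_t.

    Illustrative assumptions: the series starts from the zero vector,
    and the errors are independent N(0, sigma^2) draws.
    """
    rng = random.Random(seed)
    n = len(W)
    B = spatial_coefficient_matrix(a, b, W)
    y = [0.0] * n
    series = []
    for _ in range(T):
        # The comprehension reads the previous y before y is rebound.
        y = [sum(B[i][j] * y[j] for j in range(n)) + rng.gauss(0.0, sigma)
             for i in range(n)]
        series.append(y)
    return series


# Hypothetical scaled weights for three locations.
W = [[0.0, 0.5, 0.5],
     [0.25, 0.0, 0.75],
     [0.5, 0.5, 0.0]]
B = spatial_coefficient_matrix(0.5, 0.3, W)
series = simulate(0.5, 0.3, W, T=20, seed=1)
```

Since the series starts at zero rather than from a stationary draw, the first few simulated responses would normally be discarded before any estimation.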




1.4  An  Outline  of  Our  Results 
We  consider  two  cases  of  the  spatial  model.   In  the  first  the 
weights  are  assumed  to  be  completely  specified  (the  known  weights  case) 
and  in  the  second,  the  weights  are  of  a  specific  form  but  contain  a 
parameter  to  be  estimated  (the  variable  weights  case). 

In  Chapter  II,  we  develop  estimation  schemes  for  the  location  and 
neighbor  parameters  in  both  cases  and  also  for  the  distance  effect 
parameter in the variable weights case. These schemes involve
modification of the usual Yule-Walker estimators according to the specific
structure assumed for B (i.e., $B_r$).

In  Chapter  III,  we  show  the  existence  of  finite-valued  estimators 
using  these  schemes.   These  estimators  are  also  shown  to  be  consistent 
(in  probability)  and  asymptotically  normally  distributed.   The  asymptotics 
are  in  terms  of  T,  the  number  of  vector  responses  observed  in  time,  and 
not  n,  the  number  of  locations  in  the  array. 

Consistent estimators of the variance-covariance matrices of
both the random error term, $\underline{\epsilon}_t$, and $\underline{y}_t$ are presented in Chapter IV.

In  Chapter  V,  we  focus  on  inferential  aspects.   Procedures  based 
on  asymptotic  results  are  given  for  testing  hypotheses  and  constructing 
confidence ellipsoids for the location, neighbor, and distance (if
appropriate) parameters. We also derive an approximation to the
variance-covariance matrix of the k-step prediction errors in using a fitted
general first-order autoregressive model and make the necessary
modifications for the case of the fitted spatial model.

We  conclude  in  Chapter  VI  by  presenting  simulation  results  which 
provide  insight  into  some  of  the  procedures  developed  in  earlier  chapters. 
We also analyze a real data set.

1.5 Notation and Format

Since  notation  in  time  series  work  can  be  quite  cumbersome,  we 
summarize  our  notational  system  in  Table  1.1. 

From time to time, we introduce certain assumptions and, as we
introduce each one, we give the rationale for it. It is to be
understood that the assumption is in effect for the remainder of the paper.
At  the  end  of  each  chapter,  we  list  all  assumptions  introduced  in  that 
chapter. 

1.6  Review  of  Assumptions  Introduced  in  Chapter  I 
The assumptions introduced in Chapter I are summarized in Table 1.2.

Table 1.1
Notation

Notation                            Interpretation
----------------------------------  ------------------------------------------
$A_{i\cdot}$                        row i of matrix A
$A_{\cdot j}$                       column j of matrix A
$A_{ij}$                            the element in row i and column j of matrix A
$A_{c,ij}$                          the element in row i and column j of matrix $A_c$
$\{A_{ij}\}$                        the matrix comprised of the $A_{ij}$'s
$(ABC)_{ij}$                        the element in row i and column j of matrix ABC
$A'$                                A transposed
$|A|$                               the determinant of the matrix A
$A \otimes C$                       the Kronecker product of A and C
$I_n$                               the n x n identity matrix
$\underline{x}$                     the vector, x
$x_i$                               the i-th element of $\underline{x}$
$x_{c,i}$                           the i-th element of $\underline{x}_c$
$\{x_m\}_{m=1}^{\infty}$            the sequence, $x_1, x_2, x_3, \ldots$
$f(\cdot)$                          the function, f
$x = x$                             a particular value of the random variable x
$x_T \xrightarrow{p} x$             $x_T$ converges to x in probability
$x_T \xrightarrow{L} x$             $x_T$ converges to x in distribution or law
$\to$                               convergence of a sequence of constants
$\doteq$                            is approximately equal to
$\sim$                              is distributed as
$N_k(\underline{\mu}, \Sigma)$      the k-variate normal distribution with mean
                                    $\underline{\mu}$ and variance-covariance matrix $\Sigma$
$N(\mu, \sigma^2)$                  the univariate normal distribution with mean
                                    $\mu$ and variance $\sigma^2$
$\theta_0$                          the true value of $\theta$ when $\theta$ is a parameter
$\hat{\theta}_T$                    an estimator of $\theta$ based on T observations
$\mathbb{R}^k$                      k-dimensional real space
iff                                 if and only if
glb                                 greatest lower bound
in 3.2.1                            in Section 3.2.1
in (3.2.1)                          in equation (3.2.1)
A1                                  assumption #1
C1                                  condition #1
R1                                  result #1
Smith (1975:27)                     page 27 of the reference authored by Smith and
                                    published in 1975
Table 1.2
Assumptions Introduced in Chapter I

Section  Assumption
-------  ------------------------------------------------------------
1.3      A1: For all $w_{ij}$, $0 \le w_{ij} \le 1$.

1.3      A2: For all i, $w_{ii} = 0$.

1.3      A3: The weights are scaled to add to unity for each
             location. That is, $\sum_{j=1}^{n} w_{ij} = 1$ for all i.

1.3      A4: There are at least three locations in the array.
             If there are exactly three, the array is not in
             the form of an equilateral triangle.

1.3      A5: For a first-order autoregressive spatial time
             series, the model in (1.1.1) is such that $B = B_r$,
             where $B_{r,ii} = a$ for all i and
             $B_{r,ij} = b\, w_{ij}$ for all i and $j \ne i$.


CHAPTER  II 
ESTIMATION  OF  MODEL  PARAMETERS 

2.0  Preamble 
In  this  chapter,  we  will  consider  estimation  schemes  for 
parameters  other  than  the  variance  and  covariance  terms  in  the  special 
first-order  autoregressive  model  introduced  in  Chapter  I.   Since  the 
estimation  procedures  to  be  introduced  involve  modifications  of  the  usual 
Yule-Walker  (YW)  estimators,  a  review  of  the  YW  estimation  procedure  will 
be  presented  in  Section  2.1.   The  estimation  procedures  for  the  known 
weights  case  and  variable  weights  case  are  presented  in  Sections  2.2  and 
2.3,  respectively.   The  properties  of  the  estimators  will  be  derived 
in  Chapter  III. 

2.1  The  Usual  Yule-Walker  Estimators 
Hannan  (1970:13-15,  326-333)  and  Fuller  (1976:72-73)  are  the 
primary  references  for  the  results  of  this  section. 

Again consider the model for the general first-order
autoregressive multivariate time series,

$$ \underline{y}_t = B\, \underline{y}_{t-1} + \underline{\epsilon}_t, \tag{2.1.1} $$

where $\underline{y}_t$, $\underline{y}_{t-1}$ and $\underline{\epsilon}_t$ are vectors of length n and B is
n x n. The following assumptions are made.

A6: All roots of $f(z) = |I - Bz| = 0$ lie outside the
    unit circle.

A7: The error terms, $\underline{\epsilon}_t$'s, are independent and identically
    distributed with mean, $\underline{0}$, and variance-covariance
    matrix, G.
There are three implications of A6 and A7 that should be noted
at this stage. The results are given here without proof. The first
is that

$$ E(\underline{y}_t) = \underline{0} \quad \text{for all } t. \tag{2.1.2} $$

The second is that $\underline{y}_t$ is second-order stationary. That is, the covariance
function has the following property:

$$ E(\underline{y}_t\, \underline{y}_s') = \Gamma(s-t) \quad \text{for all } s \text{ and } t. \tag{2.1.3} $$

A third implication of A6 and A7 is that $\underline{\epsilon}_t$ is independent of
$\underline{y}_{t-1}, \underline{y}_{t-2}, \ldots$, for all t.

It  is  now  apparent  that  in  making  assumptions  A6  and  A7,  we  are 
assigning  a  stability  to  the  process  in  terms  of  its  first  two  moments. 
It  should  also  be  noted  that  since  our  special  first-order  model  can  be 
included  within  the  general  framework  of  the  model  in  (2.1.1),  A6  and  A7 
will  be  assumed  throughout  for  our  special  model  and  so  results  like 
(2.1.2)  and  (2.1.3)  will  still  follow. 
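
As a hedged aside, assumption A6 can be restated in terms of the eigenvalues of B, which also gives a simple sufficient condition in the spatial case. The steps below are standard linear algebra rather than a result taken from this work; $B_r$ denotes the spatially restricted matrix $a I_n + b W$ of A5, and the bound on the eigenvalues of W uses only A1 through A3.

```latex
\begin{align*}
|I - Bz| = 0 \ (z \ne 0)
  &\iff |B - z^{-1} I| = 0
   \iff z^{-1} \text{ is an eigenvalue of } B, \\
\text{A6}
  &\iff |\lambda| < 1 \text{ for every eigenvalue } \lambda \text{ of } B, \\
\lambda(B_r) = a + b\,\lambda(W),
  &\quad |\lambda(W)| \le \max_i \sum_{j=1}^{n} |w_{ij}| = 1
   \;\Longrightarrow\; |\lambda(B_r)| \le |a| + |b| .
\end{align*}
```

Note that z = 0 is never a root, since |I| = 1. Under these assumptions, |a| + |b| < 1 is sufficient (though not necessary) for A6 to hold in the spatial model.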

If both sides of (2.1.1) are multiplied by $\underline{y}_{t-1}'$ and expectations
are taken, we have, after applying (2.1.2), (2.1.3), A6 and A7,

$$ \Gamma(-1) = B\, \Gamma(0). $$

This leads to

$$ B = \Gamma(-1)\, \Gamma^{-1}(0). $$

The usual YW estimator of B, $B_T$, is found by replacing the
parameters on the right-hand side of the above equation with their
"moment" estimators. That is,

$$ B_T = \Gamma_T(-1)\, \Gamma_T^{-1}(0), \tag{2.1.4} $$

where

$$ \Gamma_{T,ij}(-k) = \frac{1}{T} \sum_{\ell=1}^{T-k} y_{\ell+k,i}\, y_{\ell,j}, \qquad k = 0, 1. $$

This estimator, $B_T$, defines a process which satisfies, with
probability one, the conditions for stationarity given in A6.
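
The moment estimator above can be sketched in a few lines of standard-library Python. This is an illustration under stated assumptions, not the author's implementation: the data are taken to be already mean-corrected, both sample autocovariances are divided by T (any common divisor cancels in the product), and the helper names `sample_autocov`, `inverse`, `matmul`, and `yule_walker` are hypothetical.

```python
def sample_autocov(y, k):
    """Lag-k moment matrix: entry (i, j) is (1/T) * sum_l y[l+k][i] * y[l][j]."""
    T, n = len(y), len(y[0])
    return [[sum(y[l + k][i] * y[l][j] for l in range(T - k)) / T
             for j in range(n)] for i in range(n)]


def inverse(M):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(M)
    aug = [[float(v) for v in M[i]] + [1.0 if i == j else 0.0 for j in range(n)]
           for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        if abs(aug[p][c]) < 1e-12:
            raise ValueError("matrix is singular to working precision")
        aug[c], aug[p] = aug[p], aug[c]
        piv = aug[c][c]
        aug[c] = [v / piv for v in aug[c]]
        for r in range(n):
            if r != c:
                f = aug[r][c]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]


def matmul(A, B):
    """Plain matrix product of two lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def yule_walker(y):
    """Return the lag-1 moment matrix times the inverse of the lag-0 moment matrix."""
    return matmul(sample_autocov(y, 1), inverse(sample_autocov(y, 0)))


# Tiny univariate check: observations 1, 2, 4 give (10/3) / 7 = 10/21.
est = yule_walker([[1.0], [2.0], [4.0]])
```

The explicit inverse keeps the sketch close to the formula; for larger arrays a linear solve would be the more usual numerical choice.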

As was noted, an implication of A6 and A7 is that $E(\underline{y}_t) = \underline{0}$.
This is somewhat unrealistic if $\underline{y}_t$ is regarded as the vector
observation at time t. Thus, we will let $\underline{x}_t$ denote the vector
observation at time t and assume $\underline{y}_t$ to be as in A8.

A8: Let $\underline{y}_t = \underline{x}_t - \underline{\mu}$ for all t, where $E(\underline{x}_t) = \underline{\mu}$.

The calculations necessary in (2.1.4) are then carried out using
$\underline{x}_t - \bar{\underline{x}}$, t=1,2,...,T, where
$\bar{\underline{x}} = \sum_{t=1}^{T} \underline{x}_t / T$. Hannan shows that this
mean correction does not change any asymptotic properties of interest in
our work. Consequently, for the remainder of the theoretical
considerations in this paper, it will be assumed, without loss of generality,
that the mean correction has already been made.

A9: Let $\underline{y}_t = \underline{x}_t - \bar{\underline{x}}$, t=1,2,...,T, be the observations
    used to fit the model in (2.1.1).
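
The mean correction in A9 amounts to subtracting each location's sample mean from its series. A minimal sketch, with the hypothetical function name `center` and made-up observations:

```python
def center(x):
    """Subtract the sample mean vector from each observation.

    `x` is a list of T observation vectors (one entry per location).
    Returns the centered observations and the sample mean vector.
    """
    T, n = len(x), len(x[0])
    xbar = [sum(row[i] for row in x) / T for i in range(n)]
    y = [[row[i] - xbar[i] for i in range(n)] for row in x]
    return y, xbar


# Two observations at two locations (illustrative values only).
y, xbar = center([[1.0, 10.0], [3.0, 14.0]])
```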


2.2  The  Known  Weights  Case 


2.2.1   Introduction 


We now work with the special form of the coefficient matrix, B,
which we denote by $B_r$ where

$$ B_{r,ii} = a, \qquad i = 1, 2, \ldots, n $$

and

$$ B_{r,ij} = b\, w_{ij}, \qquad i, j = 1, 2, \ldots, n; \; i \ne j. $$

The w's are known weights for which we assume A1, A2, and A3.
Our objective then is to estimate a and b which, in turn, allows us
to estimate $B_r$.

Since  the  weights  are  assumed  to  be  known  in  this  section,  it 
would  be  helpful  to  first  consider  some  possible  choices  of  weights. 

2.2.2  Examples  of  Known  Weights 

It  was  stated  in  the  introduction  that  most  of  the  work  done 
with  spatial  processes  on  irregular  lattices  has  been  with  known  weights. 
Ord  (1974)  states  in  the  discussion  on  Besag's  paper  that  one  of  the 
specifications  of  a  spatial  model  arises  when  the  spatial  relationship  is 
in the form of a time lag, which is true of our model although the
specification is different. Consequently, some of the weighting patterns
that have been suggested or used in the literature for spatial models
are presented here, since they may be appropriate for the spatial-
temporal processes that we consider. If the researcher possesses
considerable insight into the process being studied, it may be reasonable
to completely specify an appropriate weighting scheme.

In the following examples, the weights will be presented in the
unscaled form. The simplest weighting scheme is: $w_{ij} = 1$ if location j
is a nearest neighbor of location i, $j \ne i$, and $w_{ij} = 0$ otherwise.
(See Cliff, Haggett et al. (1975:161).) We will refer to models with this
weight structure as "closest-neighbor" models.

If one had regional data and wished to consider the relative size
of  the  regions  and  distances  between  their  centers,  one  might  use  the 
scheme, 

$$ w_{ij} = \frac{q_i(j)}{d_{ij}}, \qquad j \ne i, $$

where $q_i(j)$ is the proportion of location i's interior boundary which is
in contact with the boundary of location j and $d_{ij}$ is the distance between
location i and location j. (See Ord (1975).)

Both  of  the  above  weighting  schemes  assign  nonzero  weights  only 
to  those  locations  which  are  direct  or  contiguous  neighbors.   If  all 
neighbors  are  to  be  taken  into  account  in  the  weighting  scheme,  one  might 
use  the  following  weights: 

where $\delta$ is specified or

$$ w_{ij} = e^{-\alpha d_{ij}}, \qquad j \ne i, $$

where $\alpha$ is specified. (See Cliff and Ord (1975).) For both of these
weight schemes, we see that the weights either increase or decrease
monotonically as $d_{ij}$ increases, the direction of change depending on
the sign of $\delta$ and $\alpha$.

2.2.3  The Yule-Walker #2 Estimation Procedure
for the Known Weights Case

Since using the usual YW estimators to fit the first-order
autoregressive time series model, when $B = B_r$, does not account explicitly
for the spatial nature of the process being considered, it is desirable
to develop an estimation scheme which does account for this spatial nature
in a more direct fashion. By checking assumption A7, we see that no
distribution has been assumed for the $\varepsilon_t$'s. Since this allows for a
wider range of application in our work, it would seem advantageous then
to develop an estimation procedure which is distribution-free. The
distribution-free results given for $B_T$ in Sections 2.1 and 3.1 suggest
estimation procedures which modify $B_T$. There are various criteria by which
one might modify the usual YW estimator to reflect the spatial nature of
the process. One criterion is to use as an estimator of $B_r$ those
estimators of $a$ and $b$ which make the $B_{rT,ij}$'s as "close" as possible to the
usual YW estimators, the $B_{T,ij}$'s. This criterion suggests a least squares
approach.

In this case, take as the estimators, $a_{T2}$ and $b_{T2}$, those values
of $a_T$ and $b_T$ which minimize

\[ SS = \sum_{i=1}^{n} \sum_{j=1}^{n} \left( B_{T,ij} - B_{rT,ij} \right)^2 \tag{2.2.1} \]

where $B_{rT,ii} = a_T$ and $B_{rT,ij} = b_T w_{ij}$, $j \ne i$. Also $B_T$ is the matrix of
usual YW estimators given by (2.1.4). (The subscript "2" indicates that
these are the YW#2 estimators. The significance of the "2" will become
apparent later.)

Because of the form of $B_{rT}$, the sum of squares function given
in (2.2.1) can be separated into two parts, the diagonal sum of squares
and the off-diagonal sum of squares, as shown below:

\[ SS = \sum_{i=1}^{n} \left( B_{T,ii} - a_T \right)^2 + \sum_{i=1}^{n} \sum_{\substack{j=1 \\ j \ne i}}^{n} \left( B_{T,ij} - b_T w_{ij} \right)^2. \tag{2.2.2} \]

The value of $a_T$ which minimizes the above sum of squares is that value
which minimizes the left-hand component in (2.2.2). A similar statement
can be made about the minimizing $b_T$-value relative to the right-hand
component in (2.2.2).

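By taking partial first derivatives of (2.2.2) and equating to 0,
one finds that

\[ a_{T2} = \frac{\sum_{i=1}^{n} B_{T,ii}}{n} \tag{2.2.3} \]

and

\[ b_{T2} = \sum_{i=1}^{n} \sum_{\substack{j=1 \\ j \ne i}}^{n} B_{T,ij} \, u_{ij}, \tag{2.2.4} \]

where

\[ u_{ij} = w_{ij} \Big/ \Big( \sum_{k=1}^{n} \sum_{\substack{\ell=1 \\ \ell \ne k}}^{n} w_{k\ell}^2 \Big). \]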

From this discussion, we see that the YW#2 estimators of $a$ and $b$
can be found through a two-step procedure.

Step 1:   Find the usual YW estimator of $B$ (and
          hence $B_r$) by using (2.1.4).
Step 2:   Find $a_{T2}$ and $b_{T2}$ from (2.2.3) and (2.2.4),
          respectively.
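
The second step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `yw2_known_weights`, the 3-location weight matrix `W`, and the values `a0` and `b0` are hypothetical, and the matrix `B` stands in for a usual YW estimate that would in practice come from (2.1.4).

```python
def yw2_known_weights(B, W):
    """Return (a_T2, b_T2) from an n x n matrix B and known weights W.

    B plays the role of the usual YW estimate B_T; W holds the weights
    w_ij (its diagonal is ignored).
    """
    n = len(B)
    off = [(i, j) for i in range(n) for j in range(n) if i != j]
    a = sum(B[i][i] for i in range(n)) / n                 # (2.2.3)
    denom = sum(W[i][j] ** 2 for i, j in off)              # sum of squared weights
    b = sum(B[i][j] * W[i][j] for i, j in off) / denom     # (2.2.4)
    return a, b

# Hypothetical example: weights scaled to sum to one for each location.
W = [[0.0, 0.5, 0.5],
     [0.25, 0.0, 0.75],
     [0.5, 0.5, 0.0]]
a0, b0 = 0.4, 0.2
# A matrix that has exactly the restricted form (a on the diagonal, b*w_ij off it).
B = [[a0 if i == j else b0 * W[i][j] for j in range(3)] for i in range(3)]
a_hat, b_hat = yw2_known_weights(B, W)
```

When `B` has exactly the restricted form, the least squares fit recovers `a0` and `b0`; with an estimated `B` the same two formulas give the closest restricted matrix in the sense of (2.2.1).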

2.2.4  The Yule-Walker #1 Estimation Procedure
for the Known Weights Case

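In this estimation scheme, a property of the weights is used to
find another estimator of $b$. Let $a_{T1}$ be the same as $a_{T2}$ given in (2.2.3).
To find $b_{T1}$, note that

\[
\sum_{i=1}^{n} \sum_{\substack{j=1 \\ j \ne i}}^{n} B_{r,ij}
  = \sum_{i=1}^{n} \sum_{\substack{j=1 \\ j \ne i}}^{n} b \, w_{ij}
  = b \sum_{i=1}^{n} \sum_{\substack{j=1 \\ j \ne i}}^{n} w_{ij}
  = b \sum_{i=1}^{n} 1
  = n b,
\]

where the third equality holds because the scaled weights add to one for
each location. This suggests that $b$ could be estimated by

\[ b_{T1} = \frac{\sum_{i=1}^{n} \sum_{\substack{j=1 \\ j \ne i}}^{n} B_{T,ij}}{n}. \tag{2.2.5} \]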

In comparing (2.2.5) with (2.2.4), it is seen that (2.2.5) is
just a special case of (2.2.4) where $u_{ij} = \frac{1}{n}$ for all $i$, $j \ne i$. Note that
this will be the least squares estimator if the weights are $w_{ij} = \frac{1}{n-1}$ for
all $i$, $j \ne i$. This observation will make our theoretical considerations
in later chapters easier in the sense that we need only consider YW#2 in
the known weights case.

The two-step procedure for the YW#1 estimators is as follows.

Step 1:   Find the usual YW estimator of $B$ by using (2.1.4).
Step 2:   Find $a_{T1}$ and $b_{T1}$ from (2.2.3) and (2.2.5),
          respectively.
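
The special-case relationship between (2.2.5) and (2.2.4) can be checked numerically. The sketch below is illustrative only: the helper names `yw1_b` and `yw2_b` and the matrix `B` are hypothetical, and `B` is an arbitrary stand-in for a usual YW estimate.

```python
def yw1_b(B):
    """b_T1 of (2.2.5): off-diagonal sum of B divided by n."""
    n = len(B)
    return sum(B[i][j] for i in range(n) for j in range(n) if i != j) / n

def yw2_b(B, W):
    """b_T2 of (2.2.4): least squares slope of B_ij on w_ij over i != j."""
    n = len(B)
    off = [(i, j) for i in range(n) for j in range(n) if i != j]
    return (sum(B[i][j] * W[i][j] for i, j in off)
            / sum(W[i][j] ** 2 for i, j in off))

# Arbitrary illustrative matrix (not data from the text).
B = [[0.5, 0.1, 0.3],
     [0.2, 0.4, 0.0],
     [0.1, 0.25, 0.6]]
n = len(B)
# Equal weights w_ij = 1/(n-1), for which u_ij reduces to 1/n.
W_equal = [[0.0 if i == j else 1.0 / (n - 1) for j in range(n)] for i in range(n)]
b1 = yw1_b(B)
b2 = yw2_b(B, W_equal)
```

With equal weights the two estimates coincide, which is the observation used to treat YW#1 as a special case of YW#2 in the known weights case.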

2.3  The  Variable  Weights  Case 
2.3.1  Introduction 

In the study of spatial processes, one may be willing, in a
particular situation, to specify the form of the weights but not their
specific values. In these situations, the weight function would contain
a parameter or parameters to be estimated using the sample information.
We will consider the weight function of the form (before scaling),

\[ v_{ij}(\alpha) = e^{-\alpha d_{ij}}, \qquad j \ne i. \tag{2.3.1} \]

(From this point on, the notation "$w_{ij}$" will be reserved for the known
weights case.)

This weight function was introduced in 2.2.2, but now $\alpha$ is a
parameter to be estimated from the sample information. This particular
weight function will be investigated in the following section, after
which we present procedures for estimating $a$, $b$, and $\alpha$.

2.3.2  Properties  of  the  Exponential 
Weight  Function 

The exponential weight function takes distance into account in
a reasonable way, exponentially decreasing or increasing as distance
increases depending on the sign of $\alpha$. If $\alpha = 0$, each neighbor receives
identical weight. For these reasons, one can label $\alpha$ as a "distance
effect" parameter (assuming $b \ne 0$). Because of the explicit dependence
of these weights on distance, they are suitable for both regular and
irregular arrays of locations.

This weight function has certain mathematical properties which
allow one to develop the statistical and numerical properties of $\alpha_T$.
One such property is continuity everywhere as a function of $\alpha$. Another
concerns the limits of the function as $|\alpha|$ tends to $\infty$. Now,

\[
v_{ij}(\alpha) = \frac{e^{-\alpha d_{ij}}}{\sum_{\substack{k=1 \\ k \ne i}}^{n} e^{-\alpha d_{ik}}}
             = \frac{1}{\sum_{\substack{k=1 \\ k \ne i}}^{n} e^{\alpha (d_{ij} - d_{ik})}},
\qquad j \ne i. \tag{2.3.2}
\]

Let  $c_i$ = the number of locations $j$ for which $d_{ij} = \min_{k \ne i} \{d_{ik}\}$ and
     $f_i$ = the number of locations $j$ for which $d_{ij} = \max_{k} \{d_{ik}\}$.

Let us first consider the limiting case as $\alpha$ tends to $+\infty$. It is
enough to consider the limiting behavior of the components in the
denominator of $v_{ij}(\alpha)$ in (2.3.2). For $j \ne i$,

\[
\lim_{\alpha \to +\infty} e^{\alpha (d_{ij} - d_{ik})} =
\begin{cases}
+\infty & \text{iff } d_{ij} > d_{ik} \\
1       & \text{iff } d_{ij} = d_{ik} \\
0       & \text{iff } d_{ij} < d_{ik}.
\end{cases}
\]

From the limiting behavior of these components, it is clear that,

\[
\lim_{\alpha \to +\infty} v_{ij}(\alpha) =
\begin{cases}
\dfrac{1}{c_i} & \text{if } d_{ij} = \min_{k \ne i} \{d_{ik}\} \\
0              & \text{otherwise.}
\end{cases}
\]

In the limiting case then, we have the weights corresponding to the
closest-neighbor model introduced in 2.2.2.

Let us now consider the limiting case as $\alpha$ tends to $-\infty$.
By observing the result in the previous case, one might conjecture an
analogous result here with the weights corresponding to a "farthest-neighbor"
model. Indeed, it follows that

\[
\lim_{\alpha \to -\infty} v_{ij}(\alpha) =
\begin{cases}
\dfrac{1}{f_i} & \text{if } d_{ij} = \max_{k} \{d_{ik}\} \\
0              & \text{otherwise.}
\end{cases}
\]

We thus see that the exponential function allows flexibility in the
weights for the spatial process.
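
The three regimes of the exponential weight function (equal weights, closest neighbor, farthest neighbor) can be seen numerically. The following is a minimal sketch, assuming four hypothetical locations on a line; the function name `exp_weights` and the coordinates are illustrative choices, not part of the original development.

```python
import math

def exp_weights(D, alpha):
    """Scaled exponential weights v_ij(alpha) of (2.3.2); the diagonal is left at 0."""
    n = len(D)
    V = [[0.0] * n for _ in range(n)]
    for i in range(n):
        others = [k for k in range(n) if k != i]
        # Subtracting the largest exponent leaves the ratio in (2.3.2)
        # unchanged and avoids overflow for large |alpha|.
        shift = max(-alpha * D[i][k] for k in others)
        e = [math.exp(-alpha * D[i][k] - shift) for k in others]
        total = sum(e)
        for k, ek in zip(others, e):
            V[i][k] = ek / total
    return V

# Hypothetical locations on a line at 0, 1, 3 and 6.
pts = [0.0, 1.0, 3.0, 6.0]
D = [[abs(p - q) for q in pts] for p in pts]

V_zero = exp_weights(D, 0.0)     # alpha = 0: every neighbor gets weight 1/(n-1)
V_pos = exp_weights(D, 50.0)     # large positive alpha: closest-neighbor weights
V_neg = exp_weights(D, -50.0)    # large negative alpha: farthest-neighbor weights
```

For the location at 3, the locations at 0 and 6 are tied as farthest neighbors, so `V_neg` assigns each of them a weight close to 1/2, which corresponds to $f_i = 2$ in the limit above.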

2.3.3  The  Yule-Walker  #2  Estimation  Procedure 
for  the  Variable  Weights  Case 

The criterion used in deriving the YW#2 estimators here is the
same as that used in 2.2.3 (i.e., least squares). As before, the sum
of squares function to be minimized is split into two components as
follows:

\[
SS = \sum_{i=1}^{n} \left( B_{T,ii} - a_T \right)^2
   + \sum_{i=1}^{n} \sum_{\substack{j=1 \\ j \ne i}}^{n} \left[ B_{T,ij} - b_T v_{ij}(\alpha_T) \right]^2, \tag{2.3.3}
\]

where $v_{ij}(\alpha)$ is given by (2.3.2).

Then $a_{T2}$, $b_{T2}$, and $\alpha_{T2}$ are those values of $a_T$, $b_T$ and $\alpha_T$
respectively which minimize the sum of squares function in (2.3.3).

Taking the first derivative of this function with respect to $a_T$
and equating to 0 yields,

\[ a_{T2} = \frac{\sum_{i=1}^{n} B_{T,ii}}{n}. \tag{2.3.4} \]

Similar action in terms of $b_T$ yields,

\[
b_{T2} = \frac{\sum_{i=1}^{n} \sum_{\substack{j=1 \\ j \ne i}}^{n} B_{T,ij} \, v_{ij}(\alpha_T)}
              {\sum_{k=1}^{n} \sum_{\substack{\ell=1 \\ \ell \ne k}}^{n} v_{k\ell}^2(\alpha_T)}. \tag{2.3.5}
\]

After seeing the form of our sum of squares function in (2.3.3),
it is not surprising that our results here agree with those for the
YW#2 estimators of $a$ and $b$ in the known weights case. Equation (2.3.4)
agrees with (2.2.3) and (2.3.5) agrees with (2.2.4) if a value of $\alpha$
is specified.

A result that will be useful in simplifying our work is,
for $j \ne i$,

\[
\frac{\partial v_{ij}(\alpha_T)}{\partial \alpha_T}
  = v_{ij}(\alpha_T) \Big[ \sum_{\substack{k=1 \\ k \ne i}}^{n} d_{ik} v_{ik}(\alpha_T) - d_{ij} \Big]. \tag{2.3.6}
\]
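
The derivative identity (2.3.6) can be checked against a central finite difference. The sketch below works with a single location $i$ and a hypothetical list `d` of its distances to the other locations; the function names and the numerical values are assumptions made for illustration.

```python
import math

def weights(d, alpha):
    """Weights v_ij(alpha) for one location, given its distances d to the others."""
    e = [math.exp(-alpha * x) for x in d]
    total = sum(e)
    return [x / total for x in e]

def weights_derivative(d, alpha):
    """Right-hand side of (2.3.6): v_ij * (sum_k d_ik v_ik - d_ij)."""
    v = weights(d, alpha)
    d_bar = sum(x * w for x, w in zip(d, v))   # weighted mean distance
    return [w * (d_bar - x) for x, w in zip(d, v)]

d = [1.0, 2.5, 4.0]        # hypothetical distances from location i
alpha, h = 0.7, 1e-6
analytic = weights_derivative(d, alpha)
numeric = [(p - m) / (2 * h)
           for p, m in zip(weights(d, alpha + h), weights(d, alpha - h))]
max_gap = max(abs(x - y) for x, y in zip(analytic, numeric))
```

Because the weights for a location always add to one, the derivatives in `analytic` add to zero, which gives a second quick check on the formula.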


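Now taking the first partial derivative of the sum of squares function
in (2.3.3) with respect to $\alpha_T$ and equating to 0 yields,

\[
b_T \sum_{i=1}^{n} \sum_{\substack{j=1 \\ j \ne i}}^{n}
  \left[ B_{T,ij} - b_T v_{ij}(\alpha_T) \right] v_{ij}(\alpha_T)
  \Big[ d_{ij} - \sum_{\substack{k=1 \\ k \ne i}}^{n} d_{ik} v_{ik}(\alpha_T) \Big] = 0. \tag{2.3.7}
\]

Then $\alpha_{T2}$ is the solution to (2.3.7) with $b_T$ replaced by $b_{T2}$,
given in (2.3.5). The resulting equation can be simplified a bit by
dividing through by $b_{T2}$. This modification necessitates the assumption
that $b_{T2}$ is nonzero.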

A10:   In the variable weights case, $b_{T2} \ne 0$.

The necessity of this assumption is seen by examining equation
(2.3.5) and the sum of squares function in (2.3.3). Any $\alpha_T$-value which
would lead to $b_{T2} = 0$ in (2.3.5) must be meaningless because it is
obvious from (2.3.3) that if $b_{T2} = 0$, $\alpha$ cannot be estimated.

Therefore $\alpha_{T2}$ is the solution to the following equation,

\[
\sum_{i=1}^{n} \sum_{\substack{j=1 \\ j \ne i}}^{n}
  \left[ B_{T,ij} - b_{T2} v_{ij}(\alpha_T) \right] v_{ij}(\alpha_T)
  \Big[ d_{ij} - \sum_{\substack{k=1 \\ k \ne i}}^{n} d_{ik} v_{ik}(\alpha_T) \Big] = 0, \tag{2.3.8}
\]

where $b_{T2}$ is given by (2.3.5).

The YW#2 estimation procedure can be summarized in two steps.

Step 1:   Find the usual YW estimator of $B$ by using (2.1.4).
Step 2:   Find $a_{T2}$ and $b_{T2}$ explicitly from (2.3.4) and
          (2.3.5), respectively, after finding $\alpha_{T2}$ implicitly
          from (2.3.8).

Two problems arise with this estimation procedure. First, the
implicit solution (and its determination) to (2.3.8) is complicated by
the fact that $b_{T2}$ is also a function of $\alpha$. Future research would
indicate whether or not this would be a problem numerically. In any
case though, the evaluation of the statistical properties would be
more difficult.

The second potential problem occurs in the presence of a weak
neighbor effect (i.e., $b$ close to 0). Since a distance effect can be
identified only if a neighbor effect is present, it would seem that it
might be difficult to get a clear picture of any distance effect if the
neighbor effect itself is small. This suggests that $\alpha_{T2}$'s behavior might
be erratic (i.e., large variance) in the presence of a weak neighbor
effect. However, since the estimators of $b$ and $\alpha$ are intertwined in
the YW#2 procedure, it appears that there may be an effect on both $b_{T2}$
and $\alpha_{T2}$ in this case.


These problems, real and potential, should serve as motivation
to consider, at least initially, other estimation schemes for which $b$ is
estimated independently of $\alpha$. Such a scheme, YW#1, is presented in the
next section. All additional work for the variable weights case has
been for the YW#1 estimators. One aspect of future research will involve
the study of the YW#2 estimators in this case. At this stage of the
discussion, it might now be clear why the numerical labels for these
estimation procedures were given as they were.

2.3.4  The Yule-Walker #1 Estimation Procedure
for the Variable Weights Case

In 2.2.4, an estimator of $b$ was introduced which did not use
any property of the weights other than that they were scaled to add to
one for each location. It is that estimator which will be used now in
the variable weights case. The estimator of $a$ is unchanged. That is,

\[ a_{T1} = \frac{\sum_{i=1}^{n} B_{T,ii}}{n} \tag{2.3.9} \]

and

\[ b_{T1} = \frac{\sum_{i=1}^{n} \sum_{\substack{j=1 \\ j \ne i}}^{n} B_{T,ij}}{n}. \tag{2.3.10} \]

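Then $\alpha_{T1}$ is that $\alpha$-value which minimizes the following sum
of squares:

\[ SS = \sum_{i=1}^{n} \sum_{\substack{j=1 \\ j \ne i}}^{n} \left[ B_{T,ij} - b_{T1} v_{ij}(\alpha_T) \right]^2. \tag{2.3.11} \]

Using (2.3.6) and taking the first derivative of the function in
(2.3.11) with respect to $\alpha_T$ and equating to 0, we have, after simplifying,

\[
b_{T1} \sum_{i=1}^{n} \sum_{\substack{j=1 \\ j \ne i}}^{n}
  \left[ B_{T,ij} - b_{T1} v_{ij}(\alpha_T) \right] v_{ij}(\alpha_T)
  \Big[ d_{ij} - \sum_{\substack{k=1 \\ k \ne i}}^{n} d_{ik} v_{ik}(\alpha_T) \Big] = 0.
\]

We assume that $b_{T1}$ is nonzero for basically the same reasons as were
given in the previous section.

A11:   In the variable weights case, $b_{T1} \ne 0$.

It follows then that $\alpha_{T1}$ is a solution to the following equation:

\[
\sum_{i=1}^{n} \sum_{\substack{j=1 \\ j \ne i}}^{n}
  \left[ B_{T,ij} - b_{T1} v_{ij}(\alpha_T) \right] v_{ij}(\alpha_T)
  \Big[ d_{ij} - \sum_{\substack{k=1 \\ k \ne i}}^{n} d_{ik} v_{ik}(\alpha_T) \Big] = 0. \tag{2.3.12}
\]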

The YW#1 estimation procedure thus yields an estimator of $b$ which
is functionally independent of $\alpha$. The procedure can be summarized in
two steps.

Step 1:   Find the usual YW estimator of $B$ by using (2.1.4).
Step 2:   Find $a_{T1}$ and $b_{T1}$ explicitly from (2.3.9) and
          (2.3.10), respectively, and then $\alpha_{T1}$ implicitly
          from (2.3.12).
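
Step 2 of the YW#1 procedure can be sketched as follows. This is not the numerical method of the original work: instead of solving (2.3.12) directly, the sketch minimizes the sum of squares (2.3.11) with a coarse grid followed by a ternary search, which assumes a single local minimum inside the final bracket. The function names, the search interval, the four planar locations and the values `a0`, `b0`, `alpha0` are all hypothetical.

```python
import math

def exp_weights(D, alpha):
    """Scaled exponential weights v_ij(alpha) of (2.3.2); the diagonal is left at 0."""
    n = len(D)
    V = [[0.0] * n for _ in range(n)]
    for i in range(n):
        others = [k for k in range(n) if k != i]
        shift = max(-alpha * D[i][k] for k in others)   # guards against overflow
        e = [math.exp(-alpha * D[i][k] - shift) for k in others]
        total = sum(e)
        for k, ek in zip(others, e):
            V[i][k] = ek / total
    return V

def sum_sq(B, D, b, alpha):
    """Off-diagonal sum of squares (2.3.11) for fixed b."""
    V = exp_weights(D, alpha)
    n = len(B)
    return sum((B[i][j] - b * V[i][j]) ** 2
               for i in range(n) for j in range(n) if i != j)

def yw1_variable_weights(B, D, lo=-10.0, hi=10.0, steps=2001):
    """Return (a_T1, b_T1, alpha_T1) from a matrix B and distances D."""
    n = len(B)
    a = sum(B[i][i] for i in range(n)) / n                               # (2.3.9)
    b = sum(B[i][j] for i in range(n) for j in range(n) if i != j) / n   # (2.3.10)
    grid = [lo + (hi - lo) * k / (steps - 1) for k in range(steps)]
    best = min(range(steps), key=lambda m: sum_sq(B, D, b, grid[m]))
    left, right = grid[max(best - 1, 0)], grid[min(best + 1, steps - 1)]
    for _ in range(100):   # ternary search inside the bracket around the best grid point
        m1 = left + (right - left) / 3
        m2 = right - (right - left) / 3
        if sum_sq(B, D, b, m1) < sum_sq(B, D, b, m2):
            right = m2
        else:
            left = m1
    return a, b, (left + right) / 2

# Hypothetical irregular array of four locations on a plane.
pts = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 3.0)]
D = [[math.hypot(p[0] - q[0], p[1] - q[1]) for q in pts] for p in pts]
a0, b0, alpha0 = 0.4, 0.3, 0.8
V0 = exp_weights(D, alpha0)
# A matrix with exactly the model form stands in for the usual YW estimate.
B = [[a0 if i == j else b0 * V0[i][j] for j in range(4)] for i in range(4)]
a_hat, b_hat, alpha_hat = yw1_variable_weights(B, D)
```

Because `b_hat` is computed before any search over $\alpha$, the sketch reflects the point made above: the YW#1 estimator of $b$ is functionally independent of $\alpha$.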

2.4   Review  of  Assumptions  Introduced  in  Chapter  II 

The  assumptions  introduced  in  Chapter  II  are  summarized  in 
Table  2.1. 

Table 2.1
Assumptions Introduced in Chapter II

Section   Assumption

2.1       A6:    All roots of $f(z) = |I - Bz| = 0$ lie outside the
                 unit circle.

2.1       A7:    The error terms, $\varepsilon_t$'s, are independent and
                 identically distributed with mean, $0$, and
                 variance-covariance matrix, $G$.

2.1       A8:    Let $y_t = x_t - \mu$ for all $t$, where $E(x_t) = \mu$.

2.1       A9:    Let $y_t = x_t - \bar{x}$, $t = 1,2,\ldots,T$, be the
                 observations used to fit the model in (2.1.1).

2.3.3     A10:   In the variable weights case, $b_{T2} \ne 0$.

2.3.4     A11:   In the variable weights case, $b_{T1} \ne 0$.


CHAPTER  III 
PROPERTIES  OF  ESTIMATORS 

3.0  Preamble 
In this chapter, we will consider numerical and statistical
properties of the estimators developed in Chapter II. Numerical
properties of existence and uniqueness are considered, and the statistical
properties of consistency and asymptotic distribution are investigated.
In  Section  3.1,  we  review  those  properties  of  the  usual  YW  estimators 
which  are  beneficial  in  dealing  with  our  estimators.   This  section  also 
contains  a  general  lemma  which  will  be  applied  in  the  remainder  of  the 
chapter.   We  then  discuss  these  properties  for  the  known  weights  case 
in  Section  3.2  and  the  variable  weights  case  in  Section  3.3. 

3.1  Results  for  the  Usual  Yule-Walker  Estimators 
and  Another  Useful  Lemma 

In  terms  of  statistical  properties,  we  will  be  concerned  with 

consistency  (in  probability)  and  asymptotic  distributions.   The  first 

two  lemmas  give  these  properties  for  the  usual  YW  estimators.   Hannan 

(1970:329-332)  gives  proofs  of  the  results  which  lead  to  these  lemmas. 

Lemma 3.1:

If $y_t$ is generated as in (2.1.1) and A6 and A7 hold (that is, we
have a second-order stationary process), then for all $i$ and $j$,

\[ B_{T,ij} \xrightarrow{P} B_{o,ij} \quad \text{as } T \to \infty, \]

where $B_T$ is defined in (2.1.4) and $B_o$ is the true value of $B$.

Let

\[ \beta_T' = (B_{T,11}, B_{T,12}, \ldots, B_{T,1n}, B_{T,21}, \ldots, B_{T,2n}, \ldots, B_{T,n1}, \ldots, B_{T,nn}) \tag{3.1.1} \]

and

\[ \beta_o' = (B_{o,11}, B_{o,12}, \ldots, B_{o,1n}, B_{o,21}, \ldots, B_{o,2n}, \ldots, B_{o,n1}, \ldots, B_{o,nn}). \tag{3.1.2} \]

Recall that for our model, $B = B_r$ (A5).

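Lemma 3.2:

Under the same conditions as in Lemma 3.1, we have

\[ \sqrt{T} (\beta_T - \beta_o) \xrightarrow{\mathcal{L}} N_{n^2} \left[ 0, \; G \otimes \Gamma^{-1}(0) \right] \quad \text{as } T \to \infty, \]

where $G$ is the variance-covariance matrix of $\varepsilon_t$ defined in A7 and $\Gamma(0)$
is given by (2.1.3).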

The third lemma to be stated provides a useful result for the
asymptotic distribution of well-behaved functions of asymptotically
normal statistics. Rao (1973:388) gives a proof of the lemma.

Lemma 3.3:

Let $\theta_T$ be a $k$-dimensional statistic, $(\theta_{T,1}, \ldots, \theta_{T,k})$, for which

\[ \sqrt{T} (\theta_T - \theta_o) \xrightarrow{\mathcal{L}} N_k (0, \Sigma) \quad \text{as } T \to \infty. \]

Let $h_1, \ldots, h_q$ be $q$ functions of $k$ variables and assume that each
$h_i$ is totally differentiable. Then the asymptotic distribution of
$\sqrt{T} [h_i(\theta_T) - h_i(\theta_o)]$, $i = 1,2,\ldots,q$, as $T \to \infty$, is $q$-variate normal with
mean $0$ and variance-covariance matrix $H \Sigma H'$, where

\[ H = \left[ \frac{\partial h_i(\theta)}{\partial \theta_j} \right], \qquad i = 1, \ldots, q, \quad j = 1, \ldots, k. \]

The rank of the distribution is the rank of $H \Sigma H'$.

3.2  The Known Weights Case
As was mentioned in 2.2.4, we can concentrate on the properties
of the YW#2 estimators and simply note the slight adjustments necessary
for the special case of the YW#1 estimators.

3.2.1   Existence  and  Uniqueness 

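Recall from our work in 2.2.3 that

\[ a_{T2} = \frac{\sum_{i=1}^{n} B_{T,ii}}{n} \tag{3.2.1} \]

and

\[ b_{T2} = \sum_{i=1}^{n} \sum_{\substack{j=1 \\ j \ne i}}^{n} B_{T,ij} \, u_{ij}, \tag{3.2.2} \]

where $u_{ij} = w_{ij} \Big/ \sum_{k=1}^{n} \sum_{\substack{\ell=1 \\ \ell \ne k}}^{n} w_{k\ell}^2$. It is clear that $a_{T2}$ and $b_{T2}$ both exist
and are unique. This result also holds for the YW#1 estimators, since
in that case, $u_{ij} = \frac{1}{n}$ for all $i$ and $j \ne i$.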

3.2.2  Consistency  (in  Probability) 

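From Lemma 3.1, we have that the $B_{T,ij}$'s are consistent (in
probability). The modified estimators given in (3.2.1) and (3.2.2) are
just linear combinations of consistent estimators and, hence, are both
consistent (in probability) since, as $T \to \infty$,

\[
a_{T2} = \frac{\sum_{i=1}^{n} B_{T,ii}}{n}
  \xrightarrow{P} \frac{\sum_{i=1}^{n} B_{ro,ii}}{n}
  = \frac{n a_o}{n} = a_o
\]

and

\[
\begin{aligned}
b_{T2} = \sum_{i=1}^{n} \sum_{\substack{j=1 \\ j \ne i}}^{n} B_{T,ij} u_{ij}
  &\xrightarrow{P} \sum_{i=1}^{n} \sum_{\substack{j=1 \\ j \ne i}}^{n} B_{ro,ij} u_{ij} \\
  &= \sum_{i=1}^{n} \sum_{\substack{j=1 \\ j \ne i}}^{n} b_o w_{ij} u_{ij} \\
  &= b_o \left[ \frac{\sum_{i=1}^{n} \sum_{j \ne i} w_{ij}^2}{\sum_{k=1}^{n} \sum_{\ell \ne k} w_{k\ell}^2} \right] \\
  &= b_o.
\end{aligned}
\]

Similarly,

\[ a_{T1} \xrightarrow{P} a_o \quad \text{as } T \to \infty \]

and

\[ b_{T1} \xrightarrow{P} b_o \quad \text{as } T \to \infty. \]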

3.2.3  The  Asymptotic  Joint  Distribution 
of  (aT,  bT) 


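To find the asymptotic distribution of $(a_{T2}, b_{T2})$, we use
Lemma 3.2 to satisfy the conditions of Lemma 3.3. For application of
Lemma 3.3, let $\theta_T = \beta_T$, $\theta_o = \beta_o$, $k = n^2$, $\Sigma = G \otimes \Gamma^{-1}(0)$,

\[ h_1(\beta) = \frac{\sum_{i=1}^{n} B_{ii}}{n} \]

and

\[ h_2(\beta) = \sum_{i=1}^{n} \sum_{\substack{j=1 \\ j \ne i}}^{n} B_{ij} u_{ij}, \]

where $u_{ij}$ is given after (3.2.2). Then

\[
H_2 = \left[ \frac{\partial h_i(\beta)}{\partial \beta_j} \right]
= \begin{bmatrix}
\frac{1}{n} & 0 & \cdots & 0 & 0 & \frac{1}{n} & 0 & \cdots & 0 & \cdots & 0 & \cdots & 0 & \frac{1}{n} \\
0 & u_{12} & \cdots & u_{1n} & u_{21} & 0 & u_{23} & \cdots & u_{2n} & \cdots & u_{n1} & \cdots & u_{n,n-1} & 0
\end{bmatrix}. \tag{3.2.3}
\]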


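Since the elements in $H_2$ are all constants, it follows that $h_1$ and $h_2$
are totally differentiable. Now,

\[ h_1(\beta_T) = a_{T2} \]

and

\[ h_2(\beta_T) = b_{T2}. \]

From our work in 3.2.2, we have

\[ h_1(\beta_o) = a_o \]

and

\[ h_2(\beta_o) = b_o. \]

Applying Lemma 3.3 yields the result that

\[ \sqrt{T} \left[ (a_{T2}, b_{T2}) - (a_o, b_o) \right] \xrightarrow{\mathcal{L}} N_2 (0, H_2 \Sigma H_2') \quad \text{as } T \to \infty, \]

where $\Sigma = G \otimes \Gamma^{-1}(0)$ and $H_2$ is given by (3.2.3). The univariate asymptotic
distributions of both $a_{T2}$ and $b_{T2}$ follow easily from the joint result.
It also follows that

\[ \sqrt{T} \left[ (a_{T1}, b_{T1}) - (a_o, b_o) \right] \xrightarrow{\mathcal{L}} N_2 (0, H_1 \Sigma H_1') \quad \text{as } T \to \infty, \]

where $\Sigma = G \otimes \Gamma^{-1}(0)$ and $H_1$ is the same as $H_2$ except $u_{ij} = \frac{1}{n}$ for all
$i$ and $j \ne i$.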
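
The structure of $H_2$ in (3.2.3) can be made concrete with a short sketch that builds the $2 \times n^2$ matrix and confirms that applying it to the row-wise stacked vector of $B$ reproduces (3.2.1) and (3.2.2). The helper names, the weight matrix `W`, the matrix `B` and the identity matrix used in place of $\Sigma$ are hypothetical; in an application $\Sigma$ would be $G \otimes \Gamma^{-1}(0)$ or an estimate of it.

```python
def build_H(U):
    """2 x n^2 matrix of (3.2.3) for coefficients u_ij (diagonal of U ignored)."""
    n = len(U)
    row_a = [0.0] * (n * n)
    row_b = [0.0] * (n * n)
    for i in range(n):
        for j in range(n):
            pos = i * n + j            # position of B_ij in the stacking of (3.1.1)
            if i == j:
                row_a[pos] = 1.0 / n   # derivative of h_1 with respect to B_ii
            else:
                row_b[pos] = U[i][j]   # derivative of h_2 with respect to B_ij
    return [row_a, row_b]

def quad_form(H, Sigma):
    """H Sigma H' for a 2 x k matrix H and a k x k matrix Sigma."""
    k = len(Sigma)
    return [[sum(H[r][p] * Sigma[p][q] * H[c][q]
                 for p in range(k) for q in range(k))
             for c in range(2)] for r in range(2)]

# Hypothetical known weights and the resulting u_ij.
W = [[0.0, 0.5, 0.5],
     [0.25, 0.0, 0.75],
     [0.5, 0.5, 0.0]]
n = len(W)
ssq = sum(W[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
U = [[0.0 if i == j else W[i][j] / ssq for j in range(n)] for i in range(n)]

B = [[0.5, 0.1, 0.3],
     [0.2, 0.4, 0.0],
     [0.1, 0.25, 0.6]]
beta = [B[i][j] for i in range(n) for j in range(n)]   # row-wise stacking

H2 = build_H(U)
h = [sum(x * y for x, y in zip(row, beta)) for row in H2]

a_direct = sum(B[i][i] for i in range(n)) / n
b_direct = sum(B[i][j] * U[i][j] for i in range(n) for j in range(n) if i != j)

# Placeholder for Sigma only; the identity is NOT the covariance of the text.
identity = [[1.0 if p == q else 0.0 for q in range(n * n)] for p in range(n * n)]
cov_placeholder = quad_form(H2, identity)
```

The two rows of $H_2$ have disjoint support, so with the identity placeholder the off-diagonal entry of `cov_placeholder` is zero; with a genuine $\Sigma$ the two estimators would in general be correlated.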


3.3  The Variable Weights Case
As one might expect, it will be more difficult to get the properties
of our estimators here, since there is no explicit solution for the
estimator of $\alpha$. As was mentioned in 2.3.3, only the YW#1 estimators will be
considered.


3.3.1  Existence  and  Uniqueness 

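Recall from our work in 2.3.4 that

\[ a_{T1} = \frac{\sum_{i=1}^{n} B_{T,ii}}{n}, \tag{3.3.1} \]

\[ b_{T1} = \frac{\sum_{i=1}^{n} \sum_{\substack{j=1 \\ j \ne i}}^{n} B_{T,ij}}{n}, \tag{3.3.2} \]

and $\alpha_{T1}$ is the solution to the equation,

\[
\sum_{i=1}^{n} \sum_{\substack{j=1 \\ j \ne i}}^{n}
  \left[ B_{T,ij} - b_{T1} v_{ij}(\alpha) \right] v_{ij}(\alpha)
  \Big[ d_{ij} - \sum_{\substack{k=1 \\ k \ne i}}^{n} d_{ik} v_{ik}(\alpha) \Big] = 0. \tag{3.3.3}
\]

It was established in 3.2.1 that $a_{T1}$ and $b_{T1}$ both exist and are
unique. In order to show the existence of $\alpha_{T1}$, we will work with the sum
of squares function, call it $s_T(\alpha)$, the partial derivative of which led to
(3.3.3). That is,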

\[ s_T(\alpha) = \sum_{i=1}^{n} \sum_{\substack{j=1 \\ j \ne i}}^{n} \left[ B_{T,ij} - b_{T1} v_{ij}(\alpha) \right]^2, \tag{3.3.4} \]

where $v_{ij}(\alpha)$ is the exponential weight function given in (2.3.2).

Define the following sets:

\[ N_i = \{ \text{locations } j : d_{ij} = \min_{k \ne i} \{d_{ik}\} \}, \]

\[ Q_i = \{ \text{locations } j : j \notin N_i, \; j \ne i \}, \]

\[ F_i = \{ \text{locations } j : d_{ij} = \max_{k} \{d_{ik}\} \}, \]

and

\[ P_i = \{ \text{locations } j : j \notin F_i, \; j \ne i \}. \]

Recall from 2.3.2 that $c_i$ = the number of elements in $N_i$ and $f_i$ = the
number of elements in $F_i$.

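Theorem 3.4:

Suppose the following conditions are met.

C1:   The estimate, $b_{T1}$, is nonzero.

C2:   It is not true that the usual YW estimator, $B_T$, is such that

\[
B_{T,ij} =
\begin{cases}
\dfrac{b_{T1}}{c_i} & \text{for all } i \text{ and } j \in N_i \\
0                   & \text{for all } i \text{ and } j \in Q_i
\end{cases}
\qquad \text{or} \qquad
B_{T,ij} =
\begin{cases}
\dfrac{b_{T1}}{f_i} & \text{for all } i \text{ and } j \in F_i \\
0                   & \text{for all } i \text{ and } j \in P_i.
\end{cases}
\]

C3:   There exists a location $i$ for which $c_i < n-1$ where $n \ge 3$.

Then there exists a finite $\alpha_{T1}$ such that $s_T(\alpha_{T1}) = \min_{\alpha} s_T(\alpha)$.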
Proof:

Let $M_{T1} = \lim_{\alpha \to +\infty} s_T(\alpha)$ and $M_{T2} = \lim_{\alpha \to -\infty} s_T(\alpha)$. Using the results
from 2.3.2, it follows that

\[
M_{T1} = \sum_{i=1}^{n} \sum_{j \in N_i} \Big( B_{T,ij} - \frac{b_{T1}}{c_i} \Big)^2
       + \sum_{i=1}^{n} \sum_{j \in Q_i} B_{T,ij}^2 \tag{3.3.5}
\]

and

\[
M_{T2} = \sum_{i=1}^{n} \sum_{j \in F_i} \Big( B_{T,ij} - \frac{b_{T1}}{f_i} \Big)^2
       + \sum_{i=1}^{n} \sum_{j \in P_i} B_{T,ij}^2. \tag{3.3.6}
\]

From C2, we see that both $M_{T1}$ and $M_{T2}$ are positive. (They are also
finite since the $B_{T,ij}$'s are finite.) Since $s_T(\alpha)$ is clearly a continuous
function of $\alpha$, we know that if there exists a finite $\alpha_3$ such that
$s_T(\alpha_3) < M_{T1}$ and $s_T(\alpha_3) < M_{T2}$, then there exists a finite $\alpha_{T1}$ such that
$s_T(\alpha_{T1}) = \min_{\alpha} s_T(\alpha)$. Our objective is to show the existence of $\alpha_3$.
First, we will show there exists a finite $\alpha_1$ such that $s_T(\alpha_1) < M_{T1}$.
Let $\varepsilon > 0$ be such that

\[ \varepsilon < \min \Big( \min_i \Big( \frac{1}{c_i} \Big), \; R_1, \; R_2 \Big), \tag{3.3.7} \]

where $R_1$ and $R_2$ are the positive roots of two quadratics to be introduced
later in (3.3.15) and (3.3.18). Since $\lim_{\alpha \to +\infty} v_{ij}(\alpha) = 0$ for all $i$ and
$j \in Q_i$ and, for finite $\alpha$, $v_{ij}(\alpha) > 0$ for all $i$ and $j \ne i$, it follows that
there exists a finite positive $\alpha_L$ such that for all finite $\alpha > \alpha_L$,

\[ 0 < v_{ij}(\alpha) < \varepsilon \quad \text{for all } i \text{ and } j \in Q_i. \tag{3.3.8} \]

We also know that $\lim_{\alpha \to +\infty} v_{ij}(\alpha) = \frac{1}{c_i}$ for all $i$ and $j \in N_i$. Since these
weights sum to 1 for each location, it follows that for all $i$ and $j \in N_i$,
$v_{ij}(\alpha) \to \frac{1}{c_i}$ from the left as $\alpha \to +\infty$. In addition to satisfying (3.3.8),
$\alpha_L$ can be chosen to also satisfy the condition that for all finite
$\alpha > \alpha_L$, $0 < \frac{1}{c_i} - \varepsilon < v_{ij}(\alpha) < \frac{1}{c_i}$ for all $i$ and $j \in N_i$. Consider the
interval, $S = [\alpha_L, 2\alpha_L]$. Since $S$ is compact, there exists $w_L > 0$
such that

\[ w_L = \min_{i, \, j \in Q_i} \; \min_{\alpha \in S} v_{ij}(\alpha) \]

and there exists $w_{Ui} > 0$ such that $w_{Ui} < \frac{1}{c_i}$ and

\[ w_{Ui} = \max_{j \in N_i} \; \max_{\alpha \in S} v_{ij}(\alpha). \]

Thus, for all $\alpha \in S$,

\[ 0 < w_L \le v_{ij}(\alpha) < \varepsilon \quad \text{for all } i \text{ and } j \in Q_i, \tag{3.3.9} \]

and

\[ 0 < \frac{1}{c_i} - \varepsilon < v_{ij}(\alpha) \le w_{Ui} < \frac{1}{c_i} \quad \text{for all } i \text{ and } j \in N_i. \tag{3.3.10} \]

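From (3.3.10), we have

\[ 0 < \frac{1}{c_i} - w_{Ui} \le \frac{1}{c_i} - v_{ij}(\alpha) < \varepsilon < \frac{1}{c_i} \quad \text{for all } i \text{ and } j \in N_i. \tag{3.3.11} \]

We now claim that for all $\alpha \in S$, $M_{T1} - s_T(\alpha) > 0$. From (3.3.4) and (3.3.5),

\[
\begin{aligned}
M_{T1} - s_T(\alpha)
 &= \sum_{i=1}^{n} \sum_{j \in N_i} \Big( B_{T,ij} - \frac{b_{T1}}{c_i} \Big)^2
  + \sum_{i=1}^{n} \sum_{j \in Q_i} B_{T,ij}^2 \\
 &\quad - \sum_{i=1}^{n} \sum_{j \in N_i} \left[ B_{T,ij} - b_{T1} v_{ij}(\alpha) \right]^2
  - \sum_{i=1}^{n} \sum_{j \in Q_i} \left[ B_{T,ij} - b_{T1} v_{ij}(\alpha) \right]^2 \\
 &= 2 b_{T1} \Big\{ \sum_{i=1}^{n} \sum_{j \in Q_i} B_{T,ij} v_{ij}(\alpha)
  - \sum_{i=1}^{n} \sum_{j \in N_i} B_{T,ij} \Big[ \frac{1}{c_i} - v_{ij}(\alpha) \Big] \Big\} \\
 &\quad + b_{T1}^2 \Big\{ \sum_{i=1}^{n} \sum_{j \in N_i} \Big[ \frac{1}{c_i^2} - v_{ij}^2(\alpha) \Big]
  - \sum_{i=1}^{n} \sum_{j \in Q_i} v_{ij}^2(\alpha) \Big\}.
\end{aligned} \tag{3.3.12}
\]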

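Case 1:   Suppose that $b_{T1} > 0$.
It is enough to show

\[
b_{T1} \Big\{ \sum_{i=1}^{n} \sum_{j \in N_i} \Big[ \frac{1}{c_i^2} - v_{ij}^2(\alpha) \Big]
  - \sum_{i=1}^{n} \sum_{j \in Q_i} v_{ij}^2(\alpha) \Big\}
+ 2 \Big\{ \sum_{i=1}^{n} \sum_{j \in Q_i} B_{T,ij} v_{ij}(\alpha)
  - \sum_{i=1}^{n} \sum_{j \in N_i} B_{T,ij} \Big[ \frac{1}{c_i} - v_{ij}(\alpha) \Big] \Big\} > 0
\quad \text{for all } \alpha \in S. \tag{3.3.13}
\]

Now using (3.3.9), (3.3.10), and (3.3.11), we have

\[
\begin{aligned}
 & b_{T1} \Big\{ \sum_{i=1}^{n} \sum_{j \in N_i} \Big[ \frac{1}{c_i^2} - v_{ij}^2(\alpha) \Big]
  - \sum_{i=1}^{n} \sum_{j \in Q_i} v_{ij}^2(\alpha) \Big\}
  + 2 \Big\{ \sum_{i=1}^{n} \sum_{j \in Q_i} B_{T,ij} v_{ij}(\alpha)
  - \sum_{i=1}^{n} \sum_{j \in N_i} B_{T,ij} \Big[ \frac{1}{c_i} - v_{ij}(\alpha) \Big] \Big\} \\
 &\quad > b_{T1} \Big[ \sum_{i=1}^{n} \sum_{j \in N_i} \Big( \frac{1}{c_i^2} - w_{Ui}^2 \Big)
  - \sum_{i=1}^{n} \sum_{j \in Q_i} \varepsilon^2 \Big] \\
 &\qquad + 2 \Big[ \sum_{i=1}^{n} \sum_{\substack{j \in Q_i \\ B_{T,ij} > 0}} B_{T,ij} w_L
  + \sum_{i=1}^{n} \sum_{\substack{j \in Q_i \\ B_{T,ij} < 0}} B_{T,ij} \varepsilon
  - \sum_{i=1}^{n} \sum_{\substack{j \in N_i \\ B_{T,ij} > 0}} B_{T,ij} \varepsilon
  - \sum_{i=1}^{n} \sum_{\substack{j \in N_i \\ B_{T,ij} < 0}} B_{T,ij} \Big( \frac{1}{c_i} - w_{Ui} \Big) \Big] \\
 &\quad = A_1 \varepsilon^2 + B_1 \varepsilon + C_1,
\end{aligned} \tag{3.3.14}
\]

where

\[ A_1 = - b_{T1} \sum_{i=1}^{n} (n - 1 - c_i), \]

\[
B_1 = 2 \Big[ \sum_{i=1}^{n} \sum_{\substack{j \in Q_i \\ B_{T,ij} < 0}} B_{T,ij}
  - \sum_{i=1}^{n} \sum_{\substack{j \in N_i \\ B_{T,ij} > 0}} B_{T,ij} \Big],
\]

and

\[
C_1 = b_{T1} \sum_{i=1}^{n} c_i \Big( \frac{1}{c_i^2} - w_{Ui}^2 \Big)
  + 2 \Big[ \sum_{i=1}^{n} \sum_{\substack{j \in Q_i \\ B_{T,ij} > 0}} B_{T,ij} w_L
  - \sum_{i=1}^{n} \sum_{\substack{j \in N_i \\ B_{T,ij} < 0}} B_{T,ij} \Big( \frac{1}{c_i} - w_{Ui} \Big) \Big].
\]

Now consider the following polynomial,

\[ f_1(x) = A_1 x^2 + B_1 x + C_1, \tag{3.3.15} \]

where $A_1$, $B_1$ and $C_1$ are as above. From C3, it follows that $A_1 < 0$.
Clearly $B_1 \le 0$. From (3.3.10), we have $C_1 > 0$. With these conditions on
the coefficients of the quadratic in (3.3.15), it follows that $f_1(x) = 0$
has two roots, one positive and one negative.

Let $R_1$ be the positive root. Since $\varepsilon$ in (3.3.7) is such that
$0 < \varepsilon < R_1$, one can conclude that the lower bound in (3.3.14) is positive
and that (3.3.13) is established.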

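Case 2:   Suppose that $b_{T1} < 0$.

It is enough to show

\[
b_{T1} \Big\{ \sum_{i=1}^{n} \sum_{j \in N_i} \Big[ \frac{1}{c_i^2} - v_{ij}^2(\alpha) \Big]
  - \sum_{i=1}^{n} \sum_{j \in Q_i} v_{ij}^2(\alpha) \Big\}
+ 2 \Big\{ \sum_{i=1}^{n} \sum_{j \in Q_i} B_{T,ij} v_{ij}(\alpha)
  - \sum_{i=1}^{n} \sum_{j \in N_i} B_{T,ij} \Big[ \frac{1}{c_i} - v_{ij}(\alpha) \Big] \Big\} < 0
\quad \text{for all } \alpha \in S. \tag{3.3.16}
\]

Now using (3.3.9), (3.3.10), and (3.3.11), we have

\[
\begin{aligned}
 & b_{T1} \Big\{ \sum_{i=1}^{n} \sum_{j \in N_i} \Big[ \frac{1}{c_i^2} - v_{ij}^2(\alpha) \Big]
  - \sum_{i=1}^{n} \sum_{j \in Q_i} v_{ij}^2(\alpha) \Big\}
  + 2 \Big\{ \sum_{i=1}^{n} \sum_{j \in Q_i} B_{T,ij} v_{ij}(\alpha)
  - \sum_{i=1}^{n} \sum_{j \in N_i} B_{T,ij} \Big[ \frac{1}{c_i} - v_{ij}(\alpha) \Big] \Big\} \\
 &\quad < b_{T1} \Big[ \sum_{i=1}^{n} \sum_{j \in N_i} \Big( \frac{1}{c_i^2} - w_{Ui}^2 \Big)
  - \sum_{i=1}^{n} \sum_{j \in Q_i} \varepsilon^2 \Big] \\
 &\qquad + 2 \Big[ \sum_{i=1}^{n} \sum_{\substack{j \in Q_i \\ B_{T,ij} > 0}} B_{T,ij} \varepsilon
  + \sum_{i=1}^{n} \sum_{\substack{j \in Q_i \\ B_{T,ij} < 0}} B_{T,ij} w_L
  - \sum_{i=1}^{n} \sum_{\substack{j \in N_i \\ B_{T,ij} > 0}} B_{T,ij} \Big( \frac{1}{c_i} - w_{Ui} \Big)
  - \sum_{i=1}^{n} \sum_{\substack{j \in N_i \\ B_{T,ij} < 0}} B_{T,ij} \varepsilon \Big] \\
 &\quad = A_2 \varepsilon^2 + B_2 \varepsilon + C_2,
\end{aligned} \tag{3.3.17}
\]

where

\[ A_2 = - b_{T1} \sum_{i=1}^{n} (n - 1 - c_i), \]

\[
B_2 = 2 \Big[ \sum_{i=1}^{n} \sum_{\substack{j \in Q_i \\ B_{T,ij} > 0}} B_{T,ij}
  - \sum_{i=1}^{n} \sum_{\substack{j \in N_i \\ B_{T,ij} < 0}} B_{T,ij} \Big],
\]

and

\[
C_2 = b_{T1} \sum_{i=1}^{n} c_i \Big( \frac{1}{c_i^2} - w_{Ui}^2 \Big)
  + 2 \Big[ \sum_{i=1}^{n} \sum_{\substack{j \in Q_i \\ B_{T,ij} < 0}} B_{T,ij} w_L
  - \sum_{i=1}^{n} \sum_{\substack{j \in N_i \\ B_{T,ij} > 0}} B_{T,ij} \Big( \frac{1}{c_i} - w_{Ui} \Big) \Big].
\]

Now consider the following polynomial

\[ f_2(x) = A_2 x^2 + B_2 x + C_2. \tag{3.3.18} \]

From C3, it follows that $A_2 > 0$. Clearly $B_2 \ge 0$. From (3.3.10), one
can conclude that $C_2 < 0$. With these conditions on the coefficients of
the quadratic in (3.3.18), it follows that $f_2(x) = 0$ has two roots, one
positive and one negative. Let $R_2$ be the positive root. Since $\varepsilon$ in
(3.3.7) is such that $0 < \varepsilon < R_2$, one can conclude that the upper bound in
(3.3.17) is negative and (3.3.16) is thus established. Since $b_{T1} \ne 0$
by C1, all cases have been considered. Thus, there exists an $\alpha_1$
belonging to $S$ (and hence, finite) such that $s_T(\alpha_1) < M_{T1}$.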

To  complete  the  proof,  it  is  necessary  to  show  that  there  exists 
a  finite  o^  such  that  s^)  <  M^.   However,  a  check  of  the  form  of  M 
in  (3.3.6)  reveals  that  the  details  would  be  analogous  to  those  of  the 

part  just  completed.   Finally  one  can  conclude  that  a  =  a  if  M   <  M 

3  1  Tl         T2 

and  a3  =  o<2  otherwise.      Thus,    there  exists   a   finite  a3  such   that 

sT(a3)  <  MT1  and  s^)  <  HJ2   and  the  proof  is  complete. 

Before going on, some discussion of the conditions of this theorem is in order. The first condition is just A11. The second condition is an assumption which seems to be reasonable.

A12:   The usual YW estimator, $B_T$, is not such that

          $B_{T,ij} = b_{T1}/c_i$ for all $i$ and $j \in N_i$   and   $B_{T,ij} = 0$ for all $i$ and $j \in Q_i$,

       or

          $B_{T,ij} = b_{T1}/f_i$ for all $i$ and $j \in F_i$   and   $B_{T,ij} = 0$ for all $i$ and $j \in P_i$.

It appears quite unlikely that one would observe a $B_T$ of either heavily restricted form excluded by A12. It should be noted that the second condition alone does not imply that $\alpha_{T1}$ is finite, only that $M_{T1}$ and $M_{T2}$ are both positive. Finally, C3 is met by assuming A4. That is, the array has either more than 3 locations or 3 locations not forming an equilateral triangle.

Only the existence of $\alpha_{T1}$ has been established. Although we were unable to show its uniqueness, most of our empirical investigations would support such a conjecture.

3.3.2   Consistency  (in  Probability) 

Since $a_{T1}$ and $b_{T1}$ in the variable weights case are the same as their counterparts in the known weights case, we already have from 3.2.2 that

$$ a_{T1} \xrightarrow{P} a_o \quad \text{as } T \to \infty $$

and

$$ b_{T1} \xrightarrow{P} b_o \quad \text{as } T \to \infty. $$

As would be expected, to arrive at the consistency of $\alpha_{T1}$ requires more effort. A general theorem will be presented and, after its proof, it will be applied to our particular problem.

Let $\underline{\theta}_T$ be an estimator of $\underline{\theta}_o$ based on $T$ observations, $\phi$ a parameter of interest, and $f(\cdot)$ a finite-valued function of $\underline{\theta}$ and $\phi$. We define

$$ h_T(\phi) = f(\underline{\theta}_T, \phi) $$

and

$$ h_o(\phi) = f(\underline{\theta}_o, \phi). $$

Theorem 3.5:

Suppose the following conditions are met.

C1:   There exists a finite $\phi_T$ such that $h_T(\phi_T) = \min_{\phi} h_T(\phi)$, where $\phi_T$ is taken to be an estimator of $\phi_o$.

C2:   The function, $h_o(\cdot)$, is continuous everywhere.

C3:   The function, $h_o(\cdot)$, has a unique minimum at $\phi_o$.

C4:   (i) The limit of $h_o(\phi)$ as $\phi \to +\infty$, $M_1$, is finite and greater than $h_o(\phi_o)$.

      (ii) The limit of $h_o(\phi)$ as $\phi \to -\infty$, $M_2$, is finite and greater than $h_o(\phi_o)$.

C5:   We have that $\sup_{\phi} |h_T(\phi) - h_o(\phi)| \xrightarrow{P} 0$ as $T \to \infty$.

Then $\phi_T \xrightarrow{P} \phi_o$ as $T \to \infty$.


Proof:

This proof is patterned after one by Parzen (1962) and consists of establishing the following two results.

R1:   As $T \to \infty$, $h_o(\phi_T) \xrightarrow{P} h_o(\phi_o)$.

R2:   For every $\epsilon > 0$, there exists $\eta > 0$ such that $|\phi_o - \phi| \ge \epsilon$ implies that $|h_o(\phi_o) - h_o(\phi)| \ge \eta$.

Applying R1 to R2, with $\phi_T$ in place of $\phi$, yields the desired result.

Proof of R1:

If it can be shown that

$$ |h_T(\phi_T) - h_o(\phi_o)| \le \sup_{\phi} |h_T(\phi) - h_o(\phi)|, $$

it follows that

$$ |h_o(\phi_T) - h_o(\phi_o)| \le 2 \sup_{\phi} |h_T(\phi) - h_o(\phi)|, \tag{3.3.19} $$

since

$$ |h_o(\phi_T) - h_o(\phi_o)| \le |h_o(\phi_T) - h_T(\phi_T)| + |h_T(\phi_T) - h_o(\phi_o)|. $$

Case 1:   Suppose that $h_T(\phi_T) - h_o(\phi_o) \ge 0$. Then it follows from C1 that

$$ 0 \le h_T(\phi_T) - h_o(\phi_o) = \inf_{\phi} h_T(\phi) - h_o(\phi_o) \le \sup_{\phi} |h_T(\phi) - h_o(\phi)|. $$

Therefore,

$$ |h_T(\phi_T) - h_o(\phi_o)| \le \sup_{\phi} |h_T(\phi) - h_o(\phi)|. $$

Case 2:   Suppose that $h_T(\phi_T) - h_o(\phi_o) < 0$. From C3, it follows that

$$ 0 < h_o(\phi_o) - h_T(\phi_T) = \inf_{\phi} h_o(\phi) - h_T(\phi_T) \le h_o(\phi_T) - h_T(\phi_T) = |h_o(\phi_T) - h_T(\phi_T)| \le \sup_{\phi} |h_T(\phi) - h_o(\phi)|. $$

Thus,

$$ |h_T(\phi_T) - h_o(\phi_o)| \le \sup_{\phi} |h_T(\phi) - h_o(\phi)|. $$

Therefore, from our comments prior to (3.3.19), we can conclude that

$$ |h_o(\phi_T) - h_o(\phi_o)| \le 2 \sup_{\phi} |h_T(\phi) - h_o(\phi)|. $$

Applying C5 to this result leads to the conclusion,

$$ h_o(\phi_T) \xrightarrow{P} h_o(\phi_o) \quad \text{as } T \to \infty. \tag{3.3.20} $$

Proof of R2:

The proof will be by contradiction. Suppose there exists $\epsilon > 0$ such that for every $\eta > 0$, there exists $\phi$ such that $|\phi_o - \phi| \ge \epsilon$ and $|h_o(\phi_o) - h_o(\phi)| < \eta$. Now choose a sequence of $\eta$'s of the form $1/m$ and let $\phi_m$ be the corresponding $\phi$ values. We then have a sequence $\{\phi_m\}_{m=1}^{\infty}$ for which $|\phi_o - \phi_m| \ge \epsilon$ and $|h_o(\phi_o) - h_o(\phi_m)| < 1/m$ for some $\epsilon > 0$. Since, from C4, the limit of $h_o(\phi)$ as $\phi \to +\infty$, $M_1$, is finite, one can pick a finite $\phi$ large enough to get arbitrarily close to $M_1$. A similar statement can be made about small $\phi$ and $M_2$. Now since $M_1$ and $M_2$ are both greater than $h_o(\phi_o)$ and $h_o(\phi_m) \to h_o(\phi_o)$ as $m \to \infty$, there exist $M$, $\phi_1$, and $\phi_2$ such that for all $m > M$, $\phi_m \in S = [\phi_1, \phi_o - \epsilon] \cup [\phi_o + \epsilon, \phi_2]$. Since $S$ is compact and $h_o(\phi)$ is a continuous function of $\phi$, from C2, it follows that there exists $\phi_3 \in S$ such that $h_o(\phi_3) = h_o(\phi_o)$. This contradicts C3 and so R2 is established and the proof is complete.

In order to show the consistency of $\alpha_{T1}$, the conditions of Theorem 3.5 will now be verified for our particular case. We have:

$$ \underline{\theta}_T = (B_{T,12}, \ldots, B_{T,1n}, B_{T,21}, B_{T,23}, \ldots, B_{T,2n}, \ldots, B_{T,n1}, \ldots, B_{T,n,n-1}, b_{T1})', \tag{3.3.21} $$

$$ \underline{\theta}_o = (B_{\Gamma_o,12}, \ldots, B_{\Gamma_o,1n}, B_{\Gamma_o,21}, B_{\Gamma_o,23}, \ldots, B_{\Gamma_o,2n}, \ldots, B_{\Gamma_o,n1}, \ldots, B_{\Gamma_o,n,n-1}, b_o)', \tag{3.3.22} $$

(where $B_{\Gamma_o,ii} = a_o$ for all $i$ and $B_{\Gamma_o,ij} = b_o v_{ij}(\alpha_o)$ for all $i$ and $j \ne i$, with $v_{ij}(\alpha)$ given in (2.3.2)), and

$$ f(\underline{\theta}, \phi) = \sum_{i=1}^{n} \sum_{j \ne i} [B_{ij} - b\, v_{ij}(\alpha)]^2. $$

This implies that

$$ h_T(\phi) = \sum_{i=1}^{n} \sum_{j \ne i} [B_{T,ij} - b_{T1} v_{ij}(\alpha)]^2 = s_T(\alpha), \tag{3.3.23} $$

which agrees with the notation in (3.3.4), and

$$ h_o(\phi) = \sum_{i=1}^{n} \sum_{j \ne i} [B_{\Gamma_o,ij} - b_o v_{ij}(\alpha)]^2 = s_o(\alpha). \tag{3.3.24} $$

Clearly $f$ is finite-valued and we will now check the other conditions.

C1:   In our case, $\phi_T = \alpha_{T1}$. This condition was established in Theorem 3.4.

C2:   The function, $s_o(\cdot)$, is clearly a continuous function of $\alpha$ since $v_{ij}(\cdot)$ is continuous for all $i$ and $j \ne i$.
C3:   From (3.3.24),

$$ s_o(\alpha) = \sum_{i=1}^{n} \sum_{j \ne i} [B_{\Gamma_o,ij} - b_o v_{ij}(\alpha)]^2 = b_o^2 \sum_{i=1}^{n} \sum_{j \ne i} [v_{ij}(\alpha_o) - v_{ij}(\alpha)]^2. \tag{3.3.25} $$

Therefore, $s_o(\alpha_o) = 0$ and so $\alpha_o$ is a minimum of $s_o(\cdot)$.

Now suppose $s_o(\alpha_1) = 0$. From (3.3.25), this implies that $b_o [v_{ij}(\alpha_o) - v_{ij}(\alpha_1)] = 0$ for all $i$ and $j \ne i$. Since consideration of $\alpha_o$ is meaningful only if $b_o \ne 0$, it seems reasonable to consider the consistency of $\alpha_{T1}$ only if the following assumption is made.

A13:   The true value of $b$, $b_o$, is nonzero.

With A13, it follows that

$$ v_{ij}(\alpha_o) = v_{ij}(\alpha_1) \quad \text{for all } i \text{ and } j \ne i. \tag{3.3.26} $$

Now pick a location $i$ for which there exist neighbors $j$ and $k$ such that $d_{ij} \ne d_{ik}$. Such a situation exists from A4. Since $v_{ij}(\alpha)$ is of the form given in (2.3.2), we have from (3.3.26),

$$ \frac{v_{ij}(\alpha_o)}{v_{ik}(\alpha_o)} = \frac{v_{ij}(\alpha_1)}{v_{ik}(\alpha_1)}, $$

which implies that

$$ e^{\alpha_o (d_{ik} - d_{ij})} = e^{\alpha_1 (d_{ik} - d_{ij})}, $$

which implies that

$$ \alpha_1 = \alpha_o. $$

Thus, $\alpha_o$ is a unique minimum of $s_o(\cdot)$.
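The uniqueness argument for C3 can be illustrated numerically. The sketch below is only an illustration under stated assumptions: the form of (2.3.2) is not reproduced in this section, so the code assumes row-normalized exponential weights $v_{ij}(\alpha) = e^{-\alpha d_{ij}} / \sum_{k \ne i} e^{-\alpha d_{ik}}$, which agrees with the ratio identity used above. The four locations and the values of $\alpha_o$ and $b_o$ are hypothetical.

```python
import numpy as np

def weights(alpha, d):
    """Assumed form of (2.3.2): v_ij = exp(-alpha*d_ij) / sum_{k != i} exp(-alpha*d_ik)."""
    off = ~np.eye(d.shape[0], dtype=bool)        # mask selecting j != i
    w = np.where(off, np.exp(-alpha * d), 0.0)   # v_ii is set to 0
    return w / w.sum(axis=1, keepdims=True)

def s_o(alpha, alpha_o, b_o, d):
    """Right-hand side of (3.3.25)."""
    diff = weights(alpha_o, d) - weights(alpha, d)
    return b_o**2 * np.sum(diff**2)

# Hypothetical irregular array of four locations (unequal distances, as A4 requires).
pts = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [3.0, 1.0]])
d = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=2)

alpha_o, b_o = 0.7, 0.4                          # illustrative "true" values
grid = np.linspace(-3.0, 5.0, 801)               # step 0.01; contains 0.7 up to rounding
vals = np.array([s_o(a, alpha_o, b_o, d) for a in grid])
alpha_min = grid[np.argmin(vals)]

# Ratio identity from the uniqueness step: v_ij / v_ik = exp(alpha * (d_ik - d_ij)).
v = weights(alpha_o, d)
ratio = v[0, 1] / v[0, 2]
print(alpha_min, ratio, np.exp(alpha_o * (d[0, 2] - d[0, 1])))
```

On this grid the criterion vanishes only at the grid point closest to $\alpha_o$, which is what the uniqueness claim predicts for an array satisfying A4.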

C4:   From results in 2.3.2, we have,

$$ \lim_{\alpha \to +\infty} s_o(\alpha) = b_o^2 \left[ \sum_{i=1}^{n} \sum_{j \in N_i} \left( v_{ij}(\alpha_o) - \frac{1}{c_i} \right)^2 + \sum_{i=1}^{n} \sum_{j \in Q_i} v_{ij}^2(\alpha_o) \right] = M_1, $$

where $c_i$, $N_i$, and $Q_i$ are defined in 3.3.1. This limit clearly exists and is finite. Since $b_o \ne 0$ by A13, the only way for $M_1$ to equal 0 is for $\alpha_o$ to equal $+\infty$. However, if one really felt that $\alpha_o$ equalled $+\infty$, one could use the identical closest-neighbor model in the known weights case (see 2.2.2). The considerations are similar in the case of $M_2$ and so it would seem reasonable to assume that $\alpha_o$ is finite when the variable weights model is employed.

A14:   The true value of $\alpha$, $\alpha_o$, is finite.

With this assumption, C4 is established.

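C5:   Using (3.3.23) and (3.3.24), it follows from A1 that

$$ |s_T(\alpha) - s_o(\alpha)| = \left| \sum_{i=1}^{n} \sum_{j \ne i} [B_{T,ij} - b_{T1} v_{ij}(\alpha)]^2 - \sum_{i=1}^{n} \sum_{j \ne i} [B_{\Gamma_o,ij} - b_o v_{ij}(\alpha)]^2 \right| $$

$$ = \left| \sum_{i=1}^{n} \sum_{j \ne i} (B_{T,ij}^2 - B_{\Gamma_o,ij}^2) + 2 \sum_{i=1}^{n} \sum_{j \ne i} v_{ij}(\alpha) (b_o B_{\Gamma_o,ij} - b_{T1} B_{T,ij}) + \sum_{i=1}^{n} \sum_{j \ne i} v_{ij}^2(\alpha) (b_{T1}^2 - b_o^2) \right| $$

$$ \le \sum_{i=1}^{n} \sum_{j \ne i} \left[ |B_{T,ij}^2 - B_{\Gamma_o,ij}^2| + 2 |b_o B_{\Gamma_o,ij} - b_{T1} B_{T,ij}| + |b_{T1}^2 - b_o^2| \right]. $$

Thus,

$$ \sup_{\alpha} |s_T(\alpha) - s_o(\alpha)| \le \sum_{i=1}^{n} \sum_{j \ne i} \left[ |B_{T,ij}^2 - B_{\Gamma_o,ij}^2| + 2 |b_o B_{\Gamma_o,ij} - b_{T1} B_{T,ij}| + |b_{T1}^2 - b_o^2| \right]. \tag{3.3.27} $$

From Lemma 3.1, $B_{T,ij} \xrightarrow{P} B_{o,ij} = B_{\Gamma_o,ij}$ as $T \to \infty$ for all $i$ and $j$ and from 3.3.2, $b_{T1} \xrightarrow{P} b_o$ as $T \to \infty$. Therefore

$$ B_{T,ij}^2 \xrightarrow{P} B_{\Gamma_o,ij}^2 \quad \text{as } T \to \infty \text{ for all } i \text{ and } j, $$

$$ b_{T1}^2 \xrightarrow{P} b_o^2 \quad \text{as } T \to \infty, $$

and

$$ b_{T1} B_{T,ij} \xrightarrow{P} b_o B_{\Gamma_o,ij} \quad \text{as } T \to \infty \text{ for all } i \text{ and } j. $$

Applying these results to (3.3.27), we see that the upper bound converges to zero in probability as $T \to \infty$, and hence,

$$ \sup_{\alpha} |s_T(\alpha) - s_o(\alpha)| \xrightarrow{P} 0 \quad \text{as } T \to \infty. $$

All conditions of Theorem 3.5 have been satisfied. So the consistency of $\alpha_{T1}$ has been established.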
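The uniform bound in (3.3.27) rests only on $0 \le v_{ij}(\alpha) \le 1$, so it can be checked directly on a grid of $\alpha$ values. The following is a minimal sketch under assumptions: the exponential weight form stands in for (2.3.2), the locations and true values are hypothetical, and `B_T` is a randomly perturbed stand-in for a Yule-Walker estimate rather than one computed from data.

```python
import numpy as np

def weights(alpha, d):
    """Assumed form of (2.3.2): row-normalized exponential weights, v_ii = 0."""
    off = ~np.eye(d.shape[0], dtype=bool)
    w = np.where(off, np.exp(-alpha * d), 0.0)
    return w / w.sum(axis=1, keepdims=True)

def s(alpha, Bmat, b, d):
    """Criterion sum_{i} sum_{j != i} [B_ij - b v_ij(alpha)]^2, as in (3.3.23)-(3.3.24)."""
    off = ~np.eye(d.shape[0], dtype=bool)
    return np.sum(((Bmat - b * weights(alpha, d))[off]) ** 2)

pts = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [3.0, 1.0]])   # hypothetical locations
d = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=2)
n = d.shape[0]
off = ~np.eye(n, dtype=bool)

a_o, b_o, alpha_o = 0.3, 0.4, 0.7                  # illustrative true values
B_o = a_o * np.eye(n) + b_o * weights(alpha_o, d)  # spatial form B_{Gamma_o}

rng = np.random.default_rng(0)
B_T = B_o + 0.05 * rng.standard_normal((n, n))     # stand-in for the usual YW estimate
b_T1 = B_T[off].sum() / n                          # b_T1 = sum_i sum_{j != i} B_T,ij / n

# Right-hand side of (3.3.27).
bound = np.sum(np.abs(B_T**2 - B_o**2)[off]
               + 2.0 * np.abs(b_o * B_o - b_T1 * B_T)[off]) \
        + off.sum() * abs(b_T1**2 - b_o**2)

grid = np.linspace(-20.0, 20.0, 2001)
gap = max(abs(s(a, B_T, b_T1, d) - s(a, B_o, b_o, d)) for a in grid)
print(gap, bound)
```

The bound does not depend on $\alpha$, which is why convergence of its three component terms is enough to give uniform convergence of $s_T$ to $s_o$.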
3.3.3  The Asymptotic Joint Distribution
of $(a_{T1}, b_{T1}, \alpha_{T1})$

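The format of this section will be similar to that by which $\alpha_{T1}$ was shown to be consistent in the previous section. Two lemmas will be presented first. These will be followed by a general theorem and an application of it to our problem.

The first lemma is just a specific statement of the multivariate Taylor's formula. A more general statement of the formula and its proof can be found in Fleming (1965:44-49).

Lemma 3.6:

Let $h(\cdot)$ be a function of $\underline{\phi} = (\phi_1, \phi_2, \ldots, \phi_q)'$. If $\partial h(\underline{\phi}) / \partial \phi_j$ exists and is continuous everywhere for $j = 1, 2, \ldots, q$, and both $\underline{\phi}_o$ and $\underline{\phi}_T$ belong to $\mathbb{R}^q$, then there exists $\underline{x}_T \in \mathbb{R}^q$ for which

$$ h(\underline{\phi}_T) = h(\underline{\phi}_o) + \sum_{j=1}^{q} \left. \frac{\partial h(\underline{\phi})}{\partial \phi_j} \right|_{\underline{\phi} = \underline{x}_T} (\phi_{T,j} - \phi_{o,j}), $$

where $\underline{x}_T = \underline{\phi}_o + s \underline{h}$, $0 < s < 1$, and $\underline{h} = \underline{\phi}_T - \underline{\phi}_o$.

The second lemma is a result on asymptotic distributions. This lemma can be found in Fuller (1976:199).

Lemma 3.7:

Let $\underline{z}_T \xrightarrow{D} \underline{z}$ as $T \to \infty$ and $A_T \xrightarrow{P} A$ as $T \to \infty$, where $\underline{z}$ is a random $k$-vector and $A$ is a nonsingular $k \times k$ matrix of constants. Then

$$ A_T^{-1} \underline{z}_T \xrightarrow{D} A^{-1} \underline{z} \quad \text{as } T \to \infty. $$

Before stating and proving the general theorem, we make the following definitions. Let $\underline{\theta}_T = (\theta_{T,1}, \ldots, \theta_{T,k})'$ be an estimator of $\underline{\theta}_o = (\theta_{o,1}, \ldots, \theta_{o,k})'$ based on $T$ observations and $\underline{\phi}_o = (\phi_{o,1}, \phi_{o,2}, \ldots, \phi_{o,q})'$ be a parameter of interest. For $i = 1, 2, \ldots, q$, let $f_i(\cdot)$ be a function of $\underline{\theta}$ and $\underline{\phi}$ and let

$$ h_{o,i}(\underline{\phi}) = f_i(\underline{\theta}_o, \underline{\phi}), $$

$$ h_{T,i}(\underline{\phi}) = f_i(\underline{\theta}_T, \underline{\phi}), $$

and

$$ r_{o,i}(\underline{\theta}) = f_i(\underline{\theta}, \underline{\phi}_o). $$

Let $\underline{f}(\underline{\theta}, \underline{\phi})$, $\underline{h}_o(\underline{\phi})$, $\underline{h}_T(\underline{\phi})$, and $\underline{r}_o(\underline{\theta})$ be the corresponding vectors of functions.

The purpose of the following theorem is to allow one to find the asymptotic distribution of estimators when some of them are implicitly defined.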

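Theorem 3.8:

Suppose the following conditions are met.

C1:   The statistic, $\underline{\theta}_T$, is such that

$$ \sqrt{T} (\underline{\theta}_T - \underline{\theta}_o) \xrightarrow{D} N_k(\underline{0}, \Sigma) \quad \text{as } T \to \infty. $$

C2:   An estimator of $\underline{\phi}_o$ is $\underline{\phi}_T$, for which $\underline{h}_T(\underline{\phi}_T) = \underline{h}_o(\underline{\phi}_o)$.

C3:   The statistic, $\phi_{T,i}$, is a consistent (in probability) estimator of $\phi_{o,i}$ for all $i = 1, 2, \ldots, q$.

C4:   All partial derivatives of $h_{T,i}(\cdot)$ exist and are continuous everywhere for all $i = 1, 2, \ldots, q$. We let

$$ h_{T,ij}(\underline{\phi}) = \frac{\partial h_{T,i}(\underline{\phi})}{\partial \phi_j}. $$

C5:   All partial derivatives of $h_{o,i}(\cdot)$ exist and are continuous everywhere for all $i = 1, 2, \ldots, q$. We let

$$ h_{o,ij}(\underline{\phi}) = \frac{\partial h_{o,i}(\underline{\phi})}{\partial \phi_j} \quad \text{and} \quad M_o = \{ h_{o,ij}(\underline{\phi}_o) \}. $$

C6:   The matrix, $M_o$, is nonsingular.

C7:   We have that $\sup_{\underline{\phi}} |h_{T,ij}(\underline{\phi}) - h_{o,ij}(\underline{\phi})| \xrightarrow{P} 0$ as $T \to \infty$ for all $i$ and $j$.

C8:   All partial derivatives of $r_{o,i}(\cdot)$ exist and are continuous everywhere for all $i = 1, 2, \ldots, q$. We let

$$ R_o = \left\{ \left. \frac{\partial r_{o,i}(\underline{\theta})}{\partial \theta_j} \right|_{\underline{\theta} = \underline{\theta}_o} \right\}. $$

Then

$$ \sqrt{T} (\underline{\phi}_T - \underline{\phi}_o) \xrightarrow{D} N_q(\underline{0}, (M_o^{-1} R_o) \Sigma (M_o^{-1} R_o)') \quad \text{as } T \to \infty. $$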

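Proof:

Several intermediate results will be necessary to reach the desired conclusion.

R1:   For every $i = 1, 2, \ldots, q$, there exists a random vector, $\underline{x}_{Ti}$, for which

$$ h_{T,i}(\underline{\phi}_T) = h_{T,i}(\underline{\phi}_o) + \sum_{j=1}^{q} h_{T,ij}(\underline{x}_{Ti}) (\phi_{T,j} - \phi_{o,j}), $$

where $\underline{x}_{Ti} = \underline{\phi}_o + s_i \underline{h}$, $0 < s_i < 1$, and $\underline{h} = \underline{\phi}_T - \underline{\phi}_o$.

Proof of R1:

For a fixed $T$, use C4 and apply Lemma 3.6 to $h_{T,i}(\underline{\phi})$ for $i = 1, 2, \ldots, q$. The randomness is due to the fact that $h_{T,i}(\cdot)$ is a random function. Using R1, we have the following system of equations

$$ \underline{h}_T(\underline{\phi}_T) = \underline{h}_T(\underline{\phi}_o) + M_T (\underline{\phi}_T - \underline{\phi}_o), \tag{3.3.28} $$

where $M_T = \{ h_{T,ij}(\underline{x}_{Ti}) \}$.

R2:   For all $i = 1, 2, \ldots, q$, $\underline{x}_{Ti} \xrightarrow{P} \underline{\phi}_o$ as $T \to \infty$ in the sense that for all $\epsilon > 0$,

$$ P(|x_{Ti,1} - \phi_{o,1}| < \epsilon, \ldots, |x_{Ti,q} - \phi_{o,q}| < \epsilon) \to 1 \quad \text{as } T \to \infty. $$

Proof of R2:

From R1, it follows that for all $i = 1, 2, \ldots, q$, $\underline{x}_{Ti} - \underline{\phi}_o = s_i (\underline{\phi}_T - \underline{\phi}_o)$, where $0 < s_i < 1$. Thus, if

$$ E_T = \{ |\phi_{T,1} - \phi_{o,1}| < \epsilon, \ldots, |\phi_{T,q} - \phi_{o,q}| < \epsilon \} $$

occurs, then

$$ F_T = \{ |x_{Ti,1} - \phi_{o,1}| < \epsilon, \ldots, |x_{Ti,q} - \phi_{o,q}| < \epsilon \} $$

also occurs. This implies that $P(E_T) \le P(F_T)$. From C3, we have that $\phi_{T,j} \xrightarrow{P} \phi_{o,j}$ as $T \to \infty$ for all $j = 1, 2, \ldots, q$. This implies that $P(E_T) \to 1$ as $T \to \infty$, which, in turn, implies that $P(F_T) \to 1$ as $T \to \infty$, which establishes the result.

R3:   For all $i$ and $j$, $M_{T,ij} \xrightarrow{P} M_{o,ij}$ as $T \to \infty$.

Proof of R3:

For all $i$ and $j$,

$$ |h_{T,ij}(\underline{x}_{Ti}) - h_{o,ij}(\underline{\phi}_o)| \le |h_{T,ij}(\underline{x}_{Ti}) - h_{o,ij}(\underline{x}_{Ti})| + |h_{o,ij}(\underline{x}_{Ti}) - h_{o,ij}(\underline{\phi}_o)| $$

$$ \le \sup_{\underline{\phi}} |h_{T,ij}(\underline{\phi}) - h_{o,ij}(\underline{\phi})| + |h_{o,ij}(\underline{x}_{Ti}) - h_{o,ij}(\underline{\phi}_o)|. $$

Now R2 and C5 imply that $h_{o,ij}(\underline{x}_{Ti}) \xrightarrow{P} h_{o,ij}(\underline{\phi}_o)$ as $T \to \infty$. Combining this result with C7 establishes the result.

R4:   As $T \to \infty$,

$$ \sqrt{T} [\underline{r}_o(\underline{\theta}_T) - \underline{r}_o(\underline{\theta}_o)] \xrightarrow{D} N_q(\underline{0}, R_o \Sigma R_o'). $$

Proof of R4:

From C8, we have that $r_{o,i}(\underline{\theta})$ is totally differentiable for all $i = 1, 2, \ldots, q$. The result follows by an application of Lemma 3.3 to C1.

R5:   As $T \to \infty$,

$$ \sqrt{T} M_T (\underline{\phi}_T - \underline{\phi}_o) \xrightarrow{D} N_q(\underline{0}, R_o \Sigma R_o'). $$

Proof of R5:

From (3.3.28) and C2, it follows that

$$ \sqrt{T} M_T (\underline{\phi}_T - \underline{\phi}_o) = -\sqrt{T} [\underline{h}_T(\underline{\phi}_o) - \underline{h}_T(\underline{\phi}_T)] = -\sqrt{T} [\underline{h}_T(\underline{\phi}_o) - \underline{h}_o(\underline{\phi}_o)]. $$

A clarification of notation yields the result. From our definition of $\underline{h}_T(\cdot)$, $\underline{h}_o(\cdot)$, and $\underline{r}_o(\cdot)$, we have

$$ \underline{h}_T(\underline{\phi}_o) = \underline{f}(\underline{\theta}_T, \underline{\phi}_o) = \underline{r}_o(\underline{\theta}_T) $$

and

$$ \underline{h}_o(\underline{\phi}_o) = \underline{f}(\underline{\theta}_o, \underline{\phi}_o) = \underline{r}_o(\underline{\theta}_o). $$

Thus, $\sqrt{T} M_T (\underline{\phi}_T - \underline{\phi}_o) = -\sqrt{T} [\underline{r}_o(\underline{\theta}_T) - \underline{r}_o(\underline{\theta}_o)]$ and the result follows from R4, since the limiting normal distribution is symmetric about $\underline{0}$.

R6:   As $T \to \infty$,

$$ \sqrt{T} M_T^{-1} M_T (\underline{\phi}_T - \underline{\phi}_o) \xrightarrow{D} N_q(\underline{0}, (M_o^{-1} R_o) \Sigma (M_o^{-1} R_o)'). $$

Proof of R6:

Using standard results for normal distributions, the result follows when C6, R3, and R5 are applied to Lemma 3.7.

It should be noted that for any finite $T$, $M_T^{-1}$ may not exist. However, the following result indicates that this fact has no effect on the asymptotic distribution of $\sqrt{T} (\underline{\phi}_T - \underline{\phi}_o)$.

R7:   Let $A_T = |M_T|$ and $C_T = \{ \omega : A_T(\omega) \ne 0 \}$. Then $P(C_T) \to 1$ as $T \to \infty$.

Proof of R7:

It follows from R3 and standard results that $A_T \xrightarrow{P} |M_o|$ as $T \to \infty$. Since $|M_o| \ne 0$ (from C6), we have that $P(C_T) \to 1$ as $T \to \infty$.

The implication of R7 is that the behavior of $\sqrt{T} (\underline{\phi}_T - \underline{\phi}_o)$ need only be considered on $C_T$ for purposes of determining the asymptotic distribution. Thus, we can conclude from R6 that

$$ \sqrt{T} (\underline{\phi}_T - \underline{\phi}_o) \xrightarrow{D} N_q(\underline{0}, (M_o^{-1} R_o) \Sigma (M_o^{-1} R_o)') \quad \text{as } T \to \infty. $$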

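With the completion of the proof of this general result, it is now necessary to verify that the conditions hold for our specific case. We take $\underline{\theta}_T$ and $\underline{\theta}_o$ to be the vectors defined in (3.1.1) and (3.1.2), respectively, $\underline{\phi}_T = (a_{T1}, b_{T1}, \alpha_{T1})'$, $\underline{\phi}_o = (a_o, b_o, \alpha_o)'$, $\Sigma = G \otimes \Gamma^{-1}(0)$, $k = n^2$, and $q = 3$. Let

$$ f_1(\underline{\theta}, \underline{\phi}) = \frac{1}{n} \sum_{i=1}^{n} B_{ii} - a, $$

$$ f_2(\underline{\theta}, \underline{\phi}) = \frac{1}{n} \sum_{i=1}^{n} \sum_{j \ne i} B_{ij} - b, $$

and

$$ f_3(\underline{\theta}, \underline{\phi}) = \sum_{i=1}^{n} \sum_{j \ne i} [B_{ij} - b\, v_{ij}(\alpha)]\, v_{ij}(\alpha) \left[ d_{ij} - \sum_{k \ne i} d_{ik} v_{ik}(\alpha) \right], $$

where $v_{ij}(\alpha)$ is given in (2.3.2). Recall from A5 that $B_o = B_{\Gamma_o}$. It follows then that

$$ h_{o,1}(\underline{\phi}) = \frac{1}{n} \sum_{i=1}^{n} B_{\Gamma_o,ii} - a = \frac{1}{n} \sum_{i=1}^{n} a_o - a = a_o - a, $$

$$ h_{o,2}(\underline{\phi}) = \frac{1}{n} \sum_{i=1}^{n} \sum_{j \ne i} B_{\Gamma_o,ij} - b = \frac{1}{n} \sum_{i=1}^{n} \sum_{j \ne i} b_o v_{ij}(\alpha_o) - b = \frac{1}{n} \sum_{i=1}^{n} b_o - b = b_o - b, $$

and

$$ h_{o,3}(\underline{\phi}) = \sum_{i=1}^{n} \sum_{j \ne i} [B_{\Gamma_o,ij} - b\, v_{ij}(\alpha)]\, v_{ij}(\alpha) \left[ d_{ij} - \sum_{k \ne i} d_{ik} v_{ik}(\alpha) \right] $$

$$ = \sum_{i=1}^{n} \sum_{j \ne i} [b_o v_{ij}(\alpha_o) - b\, v_{ij}(\alpha)]\, v_{ij}(\alpha) \left[ d_{ij} - \sum_{k \ne i} d_{ik} v_{ik}(\alpha) \right]. $$

Recalling the form of $a_{T1}$, $b_{T1}$, and $\alpha_{T1}$ from 2.3.4, it follows that

$$ h_{T,1}(\underline{\phi}) = \frac{1}{n} \sum_{i=1}^{n} B_{T,ii} - a = a_{T1} - a, $$

$$ h_{T,2}(\underline{\phi}) = \frac{1}{n} \sum_{i=1}^{n} \sum_{j \ne i} B_{T,ij} - b = b_{T1} - b, $$

and

$$ h_{T,3}(\underline{\phi}) = \sum_{i=1}^{n} \sum_{j \ne i} [B_{T,ij} - b\, v_{ij}(\alpha)]\, v_{ij}(\alpha) \left[ d_{ij} - \sum_{k \ne i} d_{ik} v_{ik}(\alpha) \right]. $$

We also have:

$$ r_{o,1}(\underline{\theta}) = \frac{1}{n} \sum_{i=1}^{n} B_{ii} - a_o, $$

$$ r_{o,2}(\underline{\theta}) = \frac{1}{n} \sum_{i=1}^{n} \sum_{j \ne i} B_{ij} - b_o, $$

and

$$ r_{o,3}(\underline{\theta}) = \sum_{i=1}^{n} \sum_{j \ne i} [B_{ij} - b_o v_{ij}(\alpha_o)]\, v_{ij}(\alpha_o) \left[ d_{ij} - \sum_{k \ne i} d_{ik} v_{ik}(\alpha_o) \right]. $$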

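We now check the conditions.

C1:   Lemma 3.2 meets this condition.

C2:   Evaluating $\underline{h}_o(\underline{\phi}_o)$, we have:

$$ h_{o,1}(\underline{\phi}_o) = a_o - a_o = 0, $$

$$ h_{o,2}(\underline{\phi}_o) = b_o - b_o = 0, $$

and

$$ h_{o,3}(\underline{\phi}_o) = \sum_{i=1}^{n} \sum_{j \ne i} [b_o v_{ij}(\alpha_o) - b_o v_{ij}(\alpha_o)]\, v_{ij}(\alpha_o) \left[ d_{ij} - \sum_{k \ne i} d_{ik} v_{ik}(\alpha_o) \right] = 0. $$

Evaluating $\underline{h}_T(\underline{\phi}_T)$, we have:

$$ h_{T,1}(\underline{\phi}_T) = a_{T1} - a_{T1} = 0, $$

$$ h_{T,2}(\underline{\phi}_T) = b_{T1} - b_{T1} = 0, $$

and

$$ h_{T,3}(\underline{\phi}_T) = \sum_{i=1}^{n} \sum_{j \ne i} [B_{T,ij} - b_{T1} v_{ij}(\alpha_{T1})]\, v_{ij}(\alpha_{T1}) \left[ d_{ij} - \sum_{k \ne i} d_{ik} v_{ik}(\alpha_{T1}) \right] = 0, $$

by the definition of $\alpha_{T1}$ in (2.3.12). Since $\underline{h}_T(\underline{\phi}_T) = \underline{h}_o(\underline{\phi}_o)$, the condition is established.

C3:   The results from 3.3.2 satisfy this condition.

C4:   Evaluation of the partial first derivatives yields:

$$ h_{T,11}(\underline{\phi}) = h_{T,22}(\underline{\phi}) = -1, $$

$$ h_{T,12}(\underline{\phi}) = h_{T,13}(\underline{\phi}) = h_{T,21}(\underline{\phi}) = h_{T,23}(\underline{\phi}) = h_{T,31}(\underline{\phi}) = 0, $$

$$ h_{T,32}(\underline{\phi}) = -\sum_{i=1}^{n} \sum_{j \ne i} v_{ij}^2(\alpha) \left[ d_{ij} - \sum_{k \ne i} d_{ik} v_{ik}(\alpha) \right], $$

and

$$ h_{T,33}(\underline{\phi}) = \sum_{i=1}^{n} \sum_{j \ne i} v_{ij}(\alpha) \left\{ \left[ d_{ij} - \sum_{k \ne i} d_{ik} v_{ik}(\alpha) \right]^2 [2 b\, v_{ij}(\alpha) - B_{T,ij}] + [B_{T,ij} - b\, v_{ij}(\alpha)] \left\{ \sum_{k \ne i} d_{ik} v_{ik}(\alpha) \left[ d_{ik} - \sum_{l \ne i} d_{il} v_{il}(\alpha) \right] \right\} \right\}. $$

Since $v_{ij}(\cdot)$ is a continuous function of $\alpha$ for all $i$ and $j$, it follows that $h_{T,ij}(\cdot)$ exists and is continuous everywhere for all $i$ and $j$.

C5:   The forms of $\underline{h}_o(\underline{\phi})$ and $\underline{h}_T(\underline{\phi})$ imply that $h_{o,ij}(\underline{\phi}) = h_{T,ij}(\underline{\phi})$ for all $i$ and $j$ except for $i = j = 3$, where we have,

$$ h_{o,33}(\underline{\phi}) = \sum_{i=1}^{n} \sum_{j \ne i} v_{ij}(\alpha) \left\{ \left[ d_{ij} - \sum_{k \ne i} d_{ik} v_{ik}(\alpha) \right]^2 [2 b\, v_{ij}(\alpha) - b_o v_{ij}(\alpha_o)] + [b_o v_{ij}(\alpha_o) - b\, v_{ij}(\alpha)] \left\{ \sum_{k \ne i} d_{ik} v_{ik}(\alpha) \left[ d_{ik} - \sum_{l \ne i} d_{il} v_{il}(\alpha) \right] \right\} \right\}. $$

For the same reason as in C4, we can conclude that $h_{o,ij}(\cdot)$ exists and is continuous everywhere for all $i$ and $j$.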
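The closed forms for $h_{T,32}$ and $h_{T,33}$ can be checked against central finite differences of $h_{T,3}$. The sketch below assumes the exponential weight form $v_{ij}(\alpha) = e^{-\alpha d_{ij}} / \sum_{k \ne i} e^{-\alpha d_{ik}}$ for (2.3.2), which is not reproduced in this section; the locations, the matrix `B` standing in for $B_T$, and the evaluation point are all hypothetical.

```python
import numpy as np

def weights(alpha, d):
    """Assumed form of (2.3.2): row-normalized exponential weights, v_ii = 0."""
    off = ~np.eye(d.shape[0], dtype=bool)
    w = np.where(off, np.exp(-alpha * d), 0.0)
    return w / w.sum(axis=1, keepdims=True)

def parts(alpha, d):
    v = weights(alpha, d)
    D = d - (d * v).sum(axis=1, keepdims=True)   # d_ij - sum_k d_ik v_ik
    S = (d * v * D).sum(axis=1, keepdims=True)   # sum_k d_ik v_ik [d_ik - sum_l d_il v_il]
    return v, D, S

def h3(B, b, alpha, d):
    v, D, _ = parts(alpha, d)
    return np.sum((B - b * v) * v * D)           # diagonal terms vanish because v_ii = 0

def h32(B, b, alpha, d):
    v, D, _ = parts(alpha, d)
    return -np.sum(v**2 * D)

def h33(B, b, alpha, d):
    v, D, S = parts(alpha, d)
    return np.sum(v * (D**2 * (2.0 * b * v - B) + (B - b * v) * S))

pts = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [3.0, 1.0]])   # hypothetical locations
d = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=2)
rng = np.random.default_rng(1)
B = rng.standard_normal((4, 4))                  # stand-in for B_T
b, alpha, eps = 0.4, 0.7, 1e-5

fd_alpha = (h3(B, b, alpha + eps, d) - h3(B, b, alpha - eps, d)) / (2 * eps)
fd_b = (h3(B, b + eps, alpha, d) - h3(B, b - eps, alpha, d)) / (2 * eps)
print(fd_alpha, h33(B, b, alpha, d))
print(fd_b, h32(B, b, alpha, d))
```

Agreement here depends on $\partial v_{ij} / \partial \alpha = -v_{ij} [d_{ij} - \sum_{k \ne i} d_{ik} v_{ik}]$, which holds for the assumed weight form.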


C6:   By definition,

$$ M_o = \{ h_{o,ij}(\underline{\phi}_o) \} = \begin{pmatrix} -1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & h_{o,32}(\underline{\phi}_o) & h_{o,33}(\underline{\phi}_o) \end{pmatrix}, \tag{3.3.29} $$

where

$$ h_{o,32}(\underline{\phi}_o) = -\sum_{i=1}^{n} \sum_{j \ne i} v_{ij}^2(\alpha_o) \left[ d_{ij} - \sum_{k \ne i} d_{ik} v_{ik}(\alpha_o) \right] $$

and

$$ h_{o,33}(\underline{\phi}_o) = b_o \sum_{i=1}^{n} \sum_{j \ne i} v_{ij}^2(\alpha_o) \left[ d_{ij} - \sum_{k \ne i} d_{ik} v_{ik}(\alpha_o) \right]^2. $$

Therefore $|M_o| = h_{o,33}(\underline{\phi}_o)$. Since $b_o \ne 0$ (A13), it is enough to show there exist $i$ and $j$ such that $d_{ij} - \sum_{k \ne i} d_{ik} v_{ik}(\alpha_o) \ne 0$. From A4, it follows that there exists a location $i$ for which there exist neighbors $j$ and $l$ such that $d_{ij} \ne d_{il}$. Choose location $j$ to be a closest neighbor of location $i$. Now, from A3, it follows that

$$ d_{ij} - \sum_{k \ne i} d_{ik} v_{ik}(\alpha_o) = \sum_{k \ne i} d_{ij} v_{ik}(\alpha_o) - \sum_{k \ne i} d_{ik} v_{ik}(\alpha_o) = \sum_{k \ne i} (d_{ij} - d_{ik}) v_{ik}(\alpha_o). $$

Now $d_{ij} \le d_{ik}$ for all $k \ne i$ and $d_{ij} < d_{il}$. Therefore, since $v_{ij}(\alpha_o) > 0$ for all $i$ and $j \ne i$ (from A14 and (2.3.2)), it follows that $\sum_{k \ne i} (d_{ij} - d_{ik}) v_{ik}(\alpha_o) < 0$, which implies that $d_{ij} - \sum_{k \ne i} d_{ik} v_{ik}(\alpha_o) < 0$. Thus, the condition is satisfied.
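As an illustration of C6, the matrix in (3.3.29) can be assembled for a small array and its determinant compared with $h_{o,33}(\underline{\phi}_o)$. This is a sketch under assumptions: the exponential weight form again stands in for (2.3.2), and the four locations and parameter values are hypothetical.

```python
import numpy as np

def weights(alpha, d):
    """Assumed form of (2.3.2): row-normalized exponential weights, v_ii = 0."""
    off = ~np.eye(d.shape[0], dtype=bool)
    w = np.where(off, np.exp(-alpha * d), 0.0)
    return w / w.sum(axis=1, keepdims=True)

pts = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [3.0, 1.0]])   # hypothetical locations
d = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=2)
n = d.shape[0]
b_o, alpha_o = 0.4, 0.7                          # illustrative true values (A13, A14 hold)

v = weights(alpha_o, d)
D = d - (d * v).sum(axis=1, keepdims=True)       # d_ij - sum_k d_ik v_ik(alpha_o)

h_o32 = -np.sum(v**2 * D)                        # diagonal contributes 0 since v_ii = 0
h_o33 = b_o * np.sum(v**2 * D**2)

M_o = np.array([[-1.0, 0.0, 0.0],
                [0.0, -1.0, 0.0],
                [0.0, h_o32, h_o33]])
det_M_o = np.linalg.det(M_o)

# For each location i, D_ij at a closest neighbor j should be negative
# whenever the neighbors of i are not all equidistant.
d_off = d + np.diag(np.full(n, np.inf))          # exclude j = i from the argmin
closest = d_off.argmin(axis=1)
D_closest = D[np.arange(n), closest]
print(det_M_o, h_o33, D_closest)
```

Every location in this array has neighbors at unequal distances, so each closest-neighbor term is strictly negative and the determinant is nonzero.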
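C7:   Since the only $(i,j)$ combination for which $h_{T,ij}(\underline{\phi}) \ne h_{o,ij}(\underline{\phi})$ is $(3,3)$, it is enough to show that

$$ \sup_{\underline{\phi}} |h_{T,33}(\underline{\phi}) - h_{o,33}(\underline{\phi})| \xrightarrow{P} 0 \quad \text{as } T \to \infty. $$

First note that there exist $c$ and $d$, both finite, for which

$$ \left| \sum_{k \ne i} d_{ik} v_{ik}(\alpha) \right| \le c \quad \text{and} \quad \left| d_{ij} - \sum_{k \ne i} d_{ik} v_{ik}(\alpha) \right| \le d, $$

since, from A1, it follows that

$$ \left| \sum_{k \ne i} d_{ik} v_{ik}(\alpha) \right| = \sum_{k \ne i} d_{ik} v_{ik}(\alpha) \le \sum_{k \ne i} d_{ik} \le c $$

and

$$ \left| d_{ij} - \sum_{k \ne i} d_{ik} v_{ik}(\alpha) \right| \le d_{ij} + \sum_{k \ne i} d_{ik} v_{ik}(\alpha) \le d_{ij} + c \le d. $$

Now, from A1 and the above results, it follows that

$$ |h_{T,33}(\underline{\phi}) - h_{o,33}(\underline{\phi})| = \left| \sum_{i=1}^{n} \sum_{j \ne i} v_{ij}(\alpha) \left\{ \left[ d_{ij} - \sum_{k \ne i} d_{ik} v_{ik}(\alpha) \right]^2 - \sum_{k \ne i} d_{ik} v_{ik}(\alpha) \left[ d_{ik} - \sum_{l \ne i} d_{il} v_{il}(\alpha) \right] \right\} [b_o v_{ij}(\alpha_o) - B_{T,ij}] \right| $$

$$ \le \sum_{i=1}^{n} \sum_{j \ne i} |b_o v_{ij}(\alpha_o) - B_{T,ij}| (d^2 + cd). $$

Therefore,

$$ \sup_{\underline{\phi}} |h_{T,33}(\underline{\phi}) - h_{o,33}(\underline{\phi})| \le \sum_{i=1}^{n} \sum_{j \ne i} |b_o v_{ij}(\alpha_o) - B_{T,ij}| (d^2 + cd). $$

Applying Lemma 3.1 and A5 ($B_o = B_{\Gamma_o}$) leads to the conclusion that

$$ \sup_{\underline{\phi}} |h_{T,33}(\underline{\phi}) - h_{o,33}(\underline{\phi})| \xrightarrow{P} 0 \quad \text{as } T \to \infty. $$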


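C8:   Evaluation of the partial first derivatives yields:

$$ \frac{\partial r_{o,1}(\underline{\theta})}{\partial B_{ij}} = \begin{cases} 1/n & \text{if } j = i \\ 0 & \text{otherwise,} \end{cases} $$

$$ \frac{\partial r_{o,2}(\underline{\theta})}{\partial B_{ij}} = \begin{cases} 1/n & \text{if } j \ne i \\ 0 & \text{otherwise,} \end{cases} $$

and

$$ \frac{\partial r_{o,3}(\underline{\theta})}{\partial B_{ij}} = \begin{cases} v_{ij}(\alpha_o) \left[ d_{ij} - \sum_{k \ne i} d_{ik} v_{ik}(\alpha_o) \right] = r_{ij} & \text{if } j \ne i \\ 0 & \text{otherwise.} \end{cases} $$

Since all of these derivatives are constants, it follows that C8 is satisfied. Then

$$ R_o = \left( \begin{array}{cccccccccccccc} \frac{1}{n} & 0 & \cdots & 0 & 0 & \frac{1}{n} & 0 & \cdots & 0 & \cdots & 0 & \cdots & 0 & \frac{1}{n} \\ 0 & \frac{1}{n} & \cdots & \frac{1}{n} & \frac{1}{n} & 0 & \frac{1}{n} & \cdots & \frac{1}{n} & \cdots & \frac{1}{n} & \cdots & \frac{1}{n} & 0 \\ 0 & r_{12} & \cdots & r_{1n} & r_{21} & 0 & r_{23} & \cdots & r_{2n} & \cdots & r_{n1} & \cdots & r_{n,n-1} & 0 \end{array} \right). \tag{3.3.30} $$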
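The structure of $R_o$ in (3.3.30) can be made concrete with a short construction. The sketch below assumes that the $n^2$ elements of $B$ are stacked row by row, $(B_{11}, B_{12}, \ldots, B_{nn})$, which matches the column pattern of (3.3.30) but is not confirmed here against (3.1.1); it also assumes the exponential weight form for (2.3.2). Function names and numerical values are illustrative only. Because each $r_{o,i}$ is affine in $\underline{\theta}$, differences of $\underline{r}_o$ should equal $R_o$ times differences of $\underline{\theta}$.

```python
import numpy as np

def weights(alpha, d):
    """Assumed form of (2.3.2): row-normalized exponential weights, v_ii = 0."""
    off = ~np.eye(d.shape[0], dtype=bool)
    w = np.where(off, np.exp(-alpha * d), 0.0)
    return w / w.sum(axis=1, keepdims=True)

def build_R_o(alpha_o, d):
    """3 x n^2 matrix of partial derivatives of r_o, with B stacked row by row (assumed)."""
    n = d.shape[0]
    off = ~np.eye(n, dtype=bool)
    v = weights(alpha_o, d)
    r = v * (d - (d * v).sum(axis=1, keepdims=True))   # r_ij, with r_ii = 0
    return np.vstack([np.eye(n).ravel() / n, off.ravel() / n, r.ravel()])

def r_o(theta, a_o, b_o, alpha_o, d):
    """The three functions r_{o,1}, r_{o,2}, r_{o,3} evaluated at theta = vec(B)."""
    n = d.shape[0]
    B = theta.reshape(n, n)
    off = ~np.eye(n, dtype=bool)
    v = weights(alpha_o, d)
    D = d - (d * v).sum(axis=1, keepdims=True)
    return np.array([np.trace(B) / n - a_o,
                     B[off].sum() / n - b_o,
                     np.sum((B - b_o * v) * v * D)])

pts = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [3.0, 1.0]])   # hypothetical locations
d = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=2)
a_o, b_o, alpha_o = 0.3, 0.4, 0.7

R_o = build_R_o(alpha_o, d)
rng = np.random.default_rng(2)
theta1, theta2 = rng.standard_normal(16), rng.standard_normal(16)
lhs = R_o @ (theta1 - theta2)
rhs = r_o(theta1, a_o, b_o, alpha_o, d) - r_o(theta2, a_o, b_o, alpha_o, d)
print(R_o.shape, lhs, rhs)
```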
All of the conditions for Theorem 3.8 have been met without the need for any additional assumptions. We finally have that

$$ \sqrt{T} [(a_{T1}, b_{T1}, \alpha_{T1})' - (a_o, b_o, \alpha_o)'] \xrightarrow{D} N_3(\underline{0}, \Sigma_1) \quad \text{as } T \to \infty, $$

where

$$ \Sigma_1 = (M_o^{-1} R_o) \Sigma (M_o^{-1} R_o)', $$

$M_o$ is given in (3.3.29), $R_o$ is given in (3.3.30), and

$$ \Sigma = G \otimes \Gamma^{-1}(0). \tag{3.3.31} $$

The asymptotic univariate distributions of $a_{T1}$, $b_{T1}$, and $\alpha_{T1}$ follow directly from the joint result. One might fear that since the assumption, A13: $b_o \ne 0$, was made to satisfy C3 and C6, there may be some problem in a derivation of the asymptotic univariate distributions from (3.3.31). However, recall that A13 was made only in order to arrive at the consistency of $\alpha_{T1}$. For this reason and also since $b_{T1}$ is explicitly defined, A13 enters into the univariate considerations only in the case of $\alpha_{T1}$. Consequently, one should consider the asymptotic distribution of $\alpha_{T1}$ only if one is willing to assume $b_o \ne 0$. In light of the earlier discussion on the justification of A13, it is seen that this restriction is quite reasonable. For similar reasons, A14 is also necessary only in the case of $\alpha_{T1}$.

3.4  Review  of  Assumptions  Introduced  in  Chapter  III 

The  assumptions  introduced  in  Chapter  III  are  summarized  in 
Table  3.1. 


Table 3.1
Assumptions Introduced in Chapter III

Section   Assumption

3.3.1     A12:   The usual YW estimator, $B_T$, is not such that
                 $B_{T,ij} = b_{T1}/c_i$ for all $i$ and $j \in N_i$ and
                 $B_{T,ij} = 0$ for all $i$ and $j \in Q_i$,
                 or
                 $B_{T,ij} = b_{T1}/f_i$ for all $i$ and $j \in F_i$ and
                 $B_{T,ij} = 0$ for all $i$ and $j \in P_i$.

3.3.2     A13:   The true value of $b$, $b_o$, is nonzero.

3.3.2     A14:   The true value of $\alpha$, $\alpha_o$, is finite.
CHAPTER IV


ESTIMATORS OF COVARIANCE MATRICES
AND THEIR PROPERTIES


4.0  Preamble

In this chapter we will consider the estimation of two covariance matrices, $G$ and $\Gamma(0)$, where $G$ is the covariance matrix of the error term, $\underline{\epsilon}_t$, and $\Gamma(0)$ is the covariance matrix of $\underline{y}_t$. As for the other parameters of interest, estimation schemes will be introduced which exploit the nature of the spatial first-order autoregressive model. The estimators and some of their properties in the case of the general first-order autoregressive model will be presented in Section 4.1. This will serve as motivation for the specialized estimation schemes which will be introduced in Section 4.2. Some properties of these new estimators will be examined in Section 4.3. The known weights and variable weights cases are considered simultaneously, with differences noted when necessary.

4.1  Results  for  the  General  First-Order 
Autoregressive  Multivariate  Model 

4.1.1  Relationships  Among  Model  Parameters 

Just as there are relationships among the model parameters which lead to the usual YW estimator of $B$ in Section 2.1, there are relationships which will lead to the usual YW estimator of $G$. We recall from Section 2.1 that $\Gamma(0)$ is already estimated by a moment estimator, $\Gamma_T(0)$.

Consider again the model for the general first-order autoregressive time series,

$$ \underline{y}_t = B \underline{y}_{t-1} + \underline{\epsilon}_t, \tag{4.1.1} $$

for which we assume A6 and A7. That is, (A6) all roots of $f(z) = |I - Bz| = 0$ lie outside the unit circle and (A7) the $\underline{\epsilon}_t$'s are independently and identically distributed with the mean equal to $\underline{0}$ and the variance-covariance matrix equal to $G$.

If we multiply through (4.1.1) by $\underline{\epsilon}_t'$ and take expectations, we have from (2.1.2), A6 and A7,

$$ E(\underline{y}_t \underline{\epsilon}_t') = G \quad \text{for all } t. \tag{4.1.2} $$

If we then multiply through (4.1.1) by $\underline{y}_t'$, it follows from (2.1.3) and (4.1.2) that

$$ \Gamma(0) = B \Gamma(1) + G. \tag{4.1.3} $$

Another implication of A6 and A7 is the relationship,

$$ \Gamma(0) = \sum_{j=0}^{\infty} B^j G B'^j, \tag{4.1.4} $$

where $B^0 = I$. We recall from Section 2.1 that

$$ \Gamma(-1) = B \Gamma(0). \tag{4.1.5} $$

We now have a system of 3 equations involving the model parameters, $B$, $\Gamma(0)$, $\Gamma(-1)$, and $G$. (From the definition of $\Gamma(\cdot)$ in Section 2.1, it follows that $\Gamma(1) = \Gamma'(-1)$.) The results presented thus far in this section are known and can be found in Hannan (1970:13-15, 326-329) and Fuller (1976:72-73).

We will show that (4.1.4) and (4.1.5) imply (4.1.3). Since $\Gamma'(-1) = \Gamma(1)$ implies that $\Gamma(1) = \Gamma(0) B'$, it follows that

$$ B \Gamma(1) + G = B \Gamma(0) B' + G $$

$$ = B \left[ \sum_{j=0}^{\infty} B^j G B'^j \right] B' + G $$

$$ = \left[ \sum_{j=0}^{\infty} B^{j+1} G B'^{j+1} \right] + G $$

$$ = \sum_{j=0}^{\infty} B^j G B'^j $$

$$ = \Gamma(0), $$

and the result is established.
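The relationship just derived can be verified numerically by truncating the series in (4.1.4). The sketch below uses a hypothetical $2 \times 2$ coefficient matrix with spectral radius less than one (so that A6 holds) and a hypothetical positive definite $G$; the function name and the number of terms retained are illustrative choices.

```python
import numpy as np

def gamma0_series(B, G, terms=500):
    """Truncated version of (4.1.4): sum_{j=0}^{terms-1} B^j G B'^j."""
    acc = np.zeros_like(G)
    P = np.eye(B.shape[0])            # P holds B^j, starting from B^0 = I
    for _ in range(terms):
        acc += P @ G @ P.T
        P = B @ P
    return acc

B = np.array([[0.5, 0.2],
              [0.1, 0.3]])            # hypothetical coefficient matrix
G = np.array([[1.0, 0.3],
              [0.3, 2.0]])            # hypothetical error covariance matrix

spectral_radius = max(abs(np.linalg.eigvals(B)))
Gamma0 = gamma0_series(B, G)
Gamma1 = Gamma0 @ B.T                 # Gamma(1) = Gamma(0) B', from (4.1.5)
rhs = B @ Gamma1 + G                  # right-hand side of (4.1.3)
print(spectral_radius)
print(Gamma0)
print(rhs)
```

With 500 terms the truncation error is far below floating-point precision for this $B$, so the two printed matrices agree to rounding.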

4.1.2   Results  for  the  Usual  Yule-Walker  Estimators 

The usual YW estimator of $G$, $G_T$, is found by using the relationship in (4.1.3) and letting

$$ G_T = \Gamma_T(0) - B_T \Gamma_T(1), \tag{4.1.6} $$

where $\Gamma_T(0)$, $\Gamma_T(1)$ $(= \Gamma_T'(-1))$, and $B_T$ are defined in Section 2.1.

The following results from Hannan (1970:209-210,329) will be useful in determining a property of the estimators developed in Section 4.2.

Lemma 4.1:

If $\underline{y}_t$ is generated as in (4.1.1) and A6 and A7 hold, then

$$ G_{T,ij} \xrightarrow{P} G_{ij} \quad \text{as } T \to \infty \text{ for all } i \text{ and } j, $$

where $G_{T,ij}$ and $G_{ij}$ are the $(i,j)$ elements of $G_T$ and $G$, respectively.

Lemma 4.2:

Under the same conditions as in Lemma 4.1,

$$ \Gamma_{T,ij}(-k) \xrightarrow{P} \Gamma_{ij}(-k) \quad \text{as } T \to \infty \text{ for all } i \text{ and } j, \text{ and } k = 0, 1. $$

By the nature of the estimation procedures for the usual YW estimators, (4.1.3) and (4.1.5) are satisfied for $B_T$, $\Gamma_T(0)$, $\Gamma_T(-1)$, and $G_T$. It will now be shown that (4.1.4) also holds. Let

$$ S_k = \sum_{j=0}^{k} B_T^j G_T B_T'^j $$

$$ = \sum_{j=0}^{k} B_T^j [\Gamma_T(0) - B_T \Gamma_T(0) B_T'] B_T'^j $$

$$ = \sum_{j=0}^{k} [B_T^j \Gamma_T(0) B_T'^j - B_T^{j+1} \Gamma_T(0) B_T'^{j+1}] $$

$$ = \Gamma_T(0) - B_T^{k+1} \Gamma_T(0) B_T'^{k+1}. $$

We know that $B_T$ satisfies A6 and that $\Gamma_T(0)$ is nonsingular with probability one. (See Hannan (1970:329,332).) (Of course, $\Gamma_T(0)$ then must be positive definite with probability one.) If we consider a new first-order autoregressive process for which $B$ is replaced by $B_T$ and $G$ by $\Gamma_T(0)$, it follows that the sum in (4.1.4) must converge for that process. That is, $\sum_{j=0}^{\infty} B_T^j \Gamma_T(0) B_T'^j$ is a positive definite matrix with probability one. In order for this matrix sum to converge, the contribution of $B_T^{k+1} \Gamma_T(0) B_T'^{k+1}$ to this sum must tend to zero as $k \to \infty$. For this reason, one can conclude that $S_k \to \Gamma_T(0)$ as $k \to \infty$. Thus,

$$ \Gamma_T(0) = \sum_{j=0}^{\infty} B_T^j G_T B_T'^j. $$


4.2 The Yule-Walker #1 and #2 Covariance Estimators

Our objective is to develop estimators of the covariance terms in our model which reflect the spatial nature of the model. A natural modification would be to estimate $G$ using the relationship in (4.1.3) and letting

$$G_{TI} = \Gamma_T(0) - B_{rT}\,\Gamma_T(1), \tag{4.2.1}$$

where $B_{rT}$ is a general notation for the estimator of $B$ using either the YW#1 or YW#2 estimators in the known weights case or the YW#1 estimators in the variable weights case. In general,

$$B_{rT,ii} = a_T \quad \text{for all } i.$$

In the known weights case,

$$B_{rT,ij} = b_T w_{ij} \quad \text{for all } i \text{ and } j \neq i,$$

and in the variable weights case,

$$B_{rT,ij} = b_T v_{ij}(\alpha_T) \quad \text{for all } i \text{ and } j \neq i.$$

For specific estimation schemes, $B_{rT}$ is replaced by $B_{T1}$ in the YW#1 case and by $B_{T2}$ in the YW#2 case. We will use the general notation whenever possible in our discussion in this chapter and consider the specific cases only when necessary.

An undesirable property of $G_{TI}$ is apparent upon replacing $\Gamma_T(1)$ with $\Gamma_T(0)B_T'$ in (4.2.1). That is, we have

$$G_{TI} = \Gamma_T(0) - B_{rT}\,\Gamma_T(0)\,B_T'.$$

Since, in general, $B_{rT} \neq B_T$, it follows that $G_{TI}$ is not symmetric. A modification which corrects this problem would be to use $G_{rT}$ as an estimator of $G$, where

$$G_{rT} = \tfrac{1}{2}(G_{TI} + G_{TI}'). \tag{4.2.2}$$
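A minimal sketch of (4.2.1) and (4.2.2) in Python may make the symmetrization concrete. All numerical inputs below are hypothetical stand-ins for the sample moments and the structured coefficient estimate, and the function name is ours rather than anything defined in this work.

```python
def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def symmetrized_g(gamma0_T, gamma1_T, B_rT):
    """Return the unsymmetrized estimate of (4.2.1) and its average with
    its own transpose, as in (4.2.2)."""
    n = len(gamma0_T)
    prod = matmul(B_rT, gamma1_T)
    g_ti = [[gamma0_T[i][j] - prod[i][j] for j in range(n)] for i in range(n)]
    g_rt = [[0.5 * (g_ti[i][j] + g_ti[j][i]) for j in range(n)] for i in range(n)]
    return g_ti, g_rt

# Hypothetical sample autocovariances and structured estimate of B (assumptions).
gamma0_T = [[2.0, 0.5], [0.5, 1.5]]
gamma1_T = [[0.9, 0.4], [0.2, 0.6]]
B_rT = [[0.4, 0.1], [0.1, 0.4]]

g_ti, g_rt = symmetrized_g(gamma0_T, gamma1_T, B_rT)
print(g_ti)   # off-diagonal entries differ
print(g_rt)   # off-diagonal entries agree
```

For these inputs the two off-diagonal entries of the unsymmetrized matrix differ, while the averaged matrix is symmetric by construction.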


As in the case of $B_{rT}$, $G_{rT}$ will be the general notation for the estimator and $G_{T1}$ and $G_{T2}$ will denote the YW#1 and YW#2 estimators of $G$, respectively.

It follows from the work in 3.2.3 and 3.3.3 that $\Gamma^{-1}(0)$ is a component which enters into the calculation of the asymptotic covariance matrices of the estimators of $(a,b)$ or $(a,b,\alpha)$. At this stage, we have only the moment estimator of $\Gamma(0)$, $\Gamma_T(0)$. It is desirable to develop another estimator which takes into account the special structure of our model. One criterion would be to develop an estimator, $\Gamma_{rT}(0)$ in general notation, which fits into the framework of the three relationships given by (4.1.3), (4.1.4), and (4.1.5). Using (4.1.4) gives

$$\Gamma_{rT}(0) = \sum_{j=0}^{\infty} B_{rT}^{j}\, G_{rT}\, (B_{rT}')^{j}. \tag{4.2.3}$$

It is not clear that the right-hand side converges because it is not known that $B_{rT}$ satisfies A6, nor is it known what effect $G_{rT}$ might have. (This is an area for additional research.) Two suggestions are made for practical usage of (4.2.3):

(i) Calculate terms in the sum, (4.2.3), until convergence, according to a specified criterion, is established. In our work, a correlation matrix was calculated after each step in the summation. When the absolute change in the correlations from one step to the next was arbitrarily small for all elements in the matrix, convergence was assumed.

(ii) Calculate $\sum_{j=0}^{L} B_{rT}^{j}\, G_{rT}\, (B_{rT}')^{j}$, where $L$ is a preassigned limit. The choice of $L$ would probably depend on when one would expect convergence to occur if $\Gamma(0)$ were calculated by using (4.1.4).

Combinations of (i) and (ii) could be used. For example, (i) could be used, with (ii) as a default if convergence does not occur before $L$ steps. Empirical investigations like those presented in Chapter VI should provide more insight into the practical considerations of this problem.
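One way the combination of (i) and (ii) might be coded is sketched below. This is only an illustration under stated assumptions: the tolerance `tol`, the cap `max_terms` (playing the role of $L$), and the example matrices are hypothetical, and the sketch assumes the running sum keeps positive diagonal elements so that correlations can be formed.

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

def correlations(m):
    """Correlation matrix implied by a covariance-type matrix m."""
    d = [math.sqrt(m[i][i]) for i in range(len(m))]
    return [[m[i][j] / (d[i] * d[j]) for j in range(len(m))] for i in range(len(m))]

def gamma0_truncated(B_rT, G_rT, tol=1e-8, max_terms=50):
    """Accumulate B^j G (B')^j. Stop by rule (i) when every correlation changes
    by less than tol between steps; otherwise fall back on rule (ii) and stop
    at j = max_terms. Returns (sum, last j used, converged flag)."""
    Bt = transpose(B_rT)
    n = len(G_rT)
    term = [row[:] for row in G_rT]
    total = [row[:] for row in G_rT]
    prev = correlations(total)
    for j in range(1, max_terms + 1):
        term = matmul(matmul(B_rT, term), Bt)
        total = [[total[r][c] + term[r][c] for c in range(n)] for r in range(n)]
        cur = correlations(total)
        change = max(abs(cur[r][c] - prev[r][c]) for r in range(n) for c in range(n))
        if change < tol:
            return total, j, True
        prev = cur
    return total, max_terms, False

# Hypothetical estimates (assumptions, not values from this study).
B_rT = [[0.4, 0.1], [0.1, 0.4]]
G_rT = [[1.0, 0.2], [0.2, 1.0]]
total, steps, converged = gamma0_truncated(B_rT, G_rT)
capped_total, capped_steps, capped_converged = gamma0_truncated(B_rT, G_rT, max_terms=3)
print(steps, converged, capped_steps, capped_converged)
```

Returning the convergence flag alongside the sum lets the user see whether the result came from rule (i) or from the default cap in rule (ii).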
Using (i) and/or (ii) provides an estimator that is modified according to the special structure of our model. One might then suggest using $\Gamma_{rT}(0)$ and $B_{rT}$ in (4.1.5) to get a modified estimator of $\Gamma(-1)$, $\Gamma_{rT}(-1)$, which in turn would be used along with $B_{rT}$ and $\Gamma_{rT}(0)$ in (4.1.3) to modify $G_{rT}$. However, since (4.1.4) and (4.1.5) imply (4.1.3), one, in theory, would not get any modification of $G_{rT}$. This would be the case if the sum in (4.2.3) did converge and it were possible to calculate all terms in the sum. However, only a finite number of these terms will be calculated in order to determine $\Gamma_{rT}(0)$. It would seem that using only a finite number of terms would not have a major modifying effect on $G_{rT}$ if the stopping rule (in summing terms in (4.2.3)) were reasonable. If the modification was not major, $B_{rT}$, $G_{rT}$, $\Gamma_{rT}(0)$, and $\Gamma_{rT}(1)$ could be regarded, for practical purposes, as satisfying (4.1.3), (4.1.4), and (4.1.5).

The  covariance  estimation  procedures  discussed  in  this  section 
can  be  summarized  in  three  steps. 

Step 1: Estimate $B$ by $B_{rT}$, using $(a_{T1}, b_{T1})$ or $(a_{T1}, b_{T1}, \alpha_{T1})$ to calculate $B_{T1}$, or $(a_{T2}, b_{T2})$ to calculate $B_{T2}$.
Step 2: Estimate $G$ with $G_{T1}$ or $G_{T2}$ using (4.2.1) and (4.2.2).
Step 3: Calculate a modified estimate of $\Gamma(0)$, $\Gamma_{T1}(0)$ or $\Gamma_{T2}(0)$, by using (4.2.3) and following (i) and/or (ii).




4.3 Consistency of the Yule-Walker #1 and #2
Covariance Estimators

In order to show the consistency (in probability) of $G_{rT}$, we will show the consistency of each of its components. The consistency of $G_{rT}$ will then follow from standard results. In the variable weights case, we have from 3.3.2 that $a_{T1}$, $b_{T1}$, and $\alpha_{T1}$ are all consistent (in probability). It then follows that

$$B_{T1,ii} = a_{T1} \xrightarrow{P} a_o = B_{ro,ii} \quad \text{as } T \to \infty \text{ for all } i$$

and

$$B_{T1,ij} = b_{T1} v_{ij}(\alpha_{T1}) \xrightarrow{P} b_o v_{ij}(\alpha_o) = B_{ro,ij} \quad \text{as } T \to \infty \text{ for all } i \text{ and } j \neq i,$$

since $v_{ij}(\alpha)$ is continuous. Therefore, in the variable weights case, $B_{T1}$ is a consistent estimator, element-wise, of $B_{ro}$.

From our earlier work in the known weights case, it is enough to consider only the YW#2 estimators of $(a,b)$. We have from 3.2.2 that both $a_{T2}$ and $b_{T2}$ are consistent. It follows then that

$$B_{T2,ii} = a_{T2} \xrightarrow{P} a_o = B_{ro,ii} \quad \text{as } T \to \infty \text{ for all } i$$

and

$$B_{T2,ij} = b_{T2} w_{ij} \xrightarrow{P} b_o w_{ij} = B_{ro,ij} \quad \text{as } T \to \infty \text{ for all } i \text{ and } j \neq i.$$

Therefore in the known weights case, both $B_{T1}$ and $B_{T2}$ are consistent estimators, element-wise, of $B_{ro}$.

Since this covers all of our estimation procedures, we can say, in general, that

$$B_{rT,ij} \xrightarrow{P} B_{ro,ij} \quad \text{as } T \to \infty \text{ for all } i \text{ and } j.$$

That is, $B_{rT}$ is a consistent estimator, element-wise, of $B_{ro}$.

From Lemma 4.2, we have that both $\Gamma_T(0)$ and $\Gamma_T(1)$ are element-wise consistent estimators of $\Gamma(0)$ and $\Gamma(1)$, respectively. Using this result and the consistency of $B_{rT}$, we have by standard results,

$$G_{TI,ij} \xrightarrow{P} G_{ij} \quad \text{as } T \to \infty \text{ for all } i \text{ and } j.$$

It then follows from (4.2.2) that

$$G_{rT,ij} \xrightarrow{P} G_{ij} \quad \text{as } T \to \infty \text{ for all } i \text{ and } j.$$

Thus, $G_{rT}$ is an element-wise consistent and symmetric estimator of $G$. This result is analogous to Lemma 4.1.

In order to determine whether or not $\Gamma_{rT}(0)$ is a consistent estimator of $\Gamma(0)$, one must specify a stopping rule for the sum in (4.2.3). If (ii) is followed, let

$$\Gamma_L(0) = \sum_{m=0}^{L} B_{ro}^{m}\, G\, (B_{ro}')^{m}.$$

The results of this section imply that

$$\Gamma_{rT,ij}(0) = \Big(\sum_{m=0}^{L} B_{rT}^{m}\, G_{rT}\, (B_{rT}')^{m}\Big)_{ij} \xrightarrow{P} \Gamma_{L,ij}(0) \quad \text{as } T \to \infty \text{ for all } i \text{ and } j.$$

That is, $\Gamma_{rT}(0)$ is an element-wise consistent estimator of $\Gamma_L(0)$, which is a good approximation to $\Gamma(0)$. Of course from Lemma 4.2, $\Gamma_T(0)$, itself, is a consistent estimator of $\Gamma(0)$, but it is hoped that $\Gamma_{rT}(0)$ would perform better than $\Gamma_T(0)$ for finite $T$ because the specific structure of our model is taken into account in the former.

If a stopping rule like (i) were used, the study of the consistency of $\Gamma_{rT}(0)$ would be more difficult because the number of terms included in the sum would be a random variable. Consistency of the estimator in this situation was not extensively studied.

Estimation of the covariance matrices of the asymptotic distributions derived in Chapter III requires an estimator of $\Gamma^{-1}(0)$. Thus, a desirable property for an estimator of $\Gamma(0)$ is nonsingularity. The question of the nonsingularity of $\Gamma_{rT}(0)$ has only been empirically considered. For the results reported in Chapter VI, $\Gamma_{rT}(0)$ was nonsingular in all cases.


CHAPTER  V 


INFERENCE 


5.0  Preamble 


The results in Chapters II through IV represent the foundation for the inferential procedures presented in this chapter. In Section 5.1, asymptotic single-parameter test statistics and confidence intervals for $a$, $b$, and $\alpha$ (if appropriate) will be presented. Joint confidence intervals and tests will be considered in Section 5.2. Prediction with the general first-order autoregressive model will be discussed in Section 5.3, and these results will be applied to the spatial model in Section 5.4.

5.1  Asymptotic  Single-Parameter  Hypothesis  Tests 
and  Confidence  Intervals 

5.1.1  The  Known  Weights  Case 

One of the advantages of the parameterization for our special first-order autoregressive model is that it allows for exploratory study of the underlying process. In the known weights case, there are two effects to be studied, the location effect, represented by the parameter $a$, and the neighbor effect, represented by the parameter $b$. To perform hypothesis tests and construct confidence intervals, we use the results from 3.2.3, where it was shown that

$$\sqrt{T}\,[(a_T,b_T) - (a_o,b_o)] \xrightarrow{D} N_2(0, H_r \Sigma H_r') \quad \text{as } T \to \infty, \tag{5.1.1}$$

where $\Sigma = G \otimes \Gamma^{-1}(0)$ and $H_r$ is either $H_1$ (YW#1) or $H_2$ (YW#2) given in 3.2.3. Let $\sigma_a^2 = (H_r \Sigma H_r')_{11}$ and $\sigma_b^2 = (H_r \Sigma H_r')_{22}$.

The results presented here can be applied to both the YW#1 and YW#2 estimators. Consequently, a general notation without subscripts "1" and "2" will be used.

The asymptotic univariate distributions follow directly from the asymptotic joint distributions. Thus, for large values of $T$, both

$$z_a = \frac{a_T - a}{\sigma_a/\sqrt{T}} \quad \text{and} \quad z_b = \frac{b_T - b}{\sigma_b/\sqrt{T}}$$

can be regarded as being approximate standard normal random variables.

In order to test the hypothesis, $H_o: a = a_o$, the usual z-test would be used with the test statistic, $z_{a_o}$. If $a_o = 0$, this would provide a test of the hypothesis of no locational effect. That is, we test that the response at a location at time $t$ is not explicitly related to the response at that location at time $(t-1)$.
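The z-test just described can be sketched in a few lines of Python. The estimate, the value used for $\sigma_a$, and the series length are hypothetical numbers chosen for illustration, and the function name is ours; in practice $\sigma_a$ would itself have to be estimated.

```python
import math
from statistics import NormalDist

def z_test(estimate, hypothesized, sigma, T):
    """Asymptotic z statistic (estimate - hypothesized) / (sigma / sqrt(T))
    and its two-sided p-value under the standard normal approximation."""
    z = (estimate - hypothesized) / (sigma / math.sqrt(T))
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical inputs: a_T = 0.42, sigma_a = 1.5, T = 100, testing a_o = 0.
z, p_value = z_test(0.42, 0.0, 1.5, 100)
print(z, p_value)
```

With these inputs the statistic is about 2.8, so the hypothesis of no locational effect would be rejected at the usual 5% level.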

In the same way, to test the hypothesis, $H_o: b = b_o$, one could use the test statistic, $z_{b_o}$. If $b_o = 0$, this would provide a test of the hypothesis of no overall neighbor effect in the sense that the response at a location at time $t$ is not explicitly related to the response at any of the other locations at time $(t-1)$. Note that if $b_{T2}$ were used, the test would represent a test of an overall neighbor effect with regard to a specific weight structure. However, if $b_{T1}$ were used, one considers the specific weight structure only through $\Sigma$ and the assumption that the weights are scaled to add to unity for each location. (The weights determine $B$ which, along with $G$, determines $\Gamma(0)$ and hence, $\Sigma$.) While for theoretical considerations, the YW#1 estimator can be regarded as fitting within the general framework of the YW#2 estimation scheme, it is seen that in application, at least in terms of calculating the estimate of $b$ and performing this hypothesis test, the YW#1 approach is more general.

If one were interested in estimating $a$ and $b$, the usual confidence intervals could be constructed. A $(1-\gamma)100\%$ confidence interval for $a$ would be

$$a_T \pm z_{\gamma/2}\,\sigma_a/\sqrt{T},$$

where $z_{\gamma/2}$ is such that $P(z > z_{\gamma/2}) = \gamma/2$ and $z \sim N(0,1)$.

Likewise, a $(1-\gamma)100\%$ confidence interval for $b$ would be

$$b_T \pm z_{\gamma/2}\,\sigma_b/\sqrt{T}.$$

Upon observing the form of $\sigma_a$ or $\sigma_b$ for either estimation scheme, it is clear that for practical usage of these results, one would need to estimate $\sigma_a$ and $\sigma_b$ by $\sigma_{aT}$ and $\sigma_{bT}$, respectively. Since $H_1$ and $H_2$ are both constant matrices, $\Sigma$ is the only matrix that needs to be estimated. We estimate $\Sigma$ by replacing $G$ and $\Gamma(0)$ with their consistent estimators, $G_{rT}$ and $\Gamma_{rT}(0)$, presented in Section 4.2. It should be noted that in using $\sigma_{aT}$ and $\sigma_{bT}$ as consistent estimators of $\sigma_a$ and $\sigma_b$, we are taking into account the specific weight structure assumed for our model in both the YW#1 and YW#2 cases through the use of $G_{rT}$ and $\Gamma_{rT}(0)$.

5.1.2 The Variable Weights Case

For the variable weights case, the parameter $\alpha$ provides for a distance effect in addition to the location and neighbor effects. Recall that in 3.3.3, we showed that

$$\sqrt{T}\,[(a_{T1},b_{T1},\alpha_{T1}) - (a_o,b_o,\alpha_o)] \xrightarrow{D} N_3(0, \Sigma_1) \quad \text{as } T \to \infty, \tag{5.1.2}$$

where $\Sigma_1 = (M_o^{-1}R_o)\,\Sigma\,(M_o^{-1}R_o)'$, $M_o$ is given in (3.3.29), $R_o$ is given in (3.3.30), and $\Sigma = G \otimes \Gamma^{-1}(0)$.

As in 5.1.1, the appropriate test statistic to use in testing the hypothesis, $H_o: a = a_o$, would be

$$z_{a_o} = \frac{a_{T1} - a_o}{\sigma_{a1}/\sqrt{T}}.$$

The interpretation of the test when $a_o = 0$ is the same as in 5.1.1. Similarly, the appropriate test statistic for testing $H_o: b = b_o$ would be

$$z_{b_o} = \frac{b_{T1} - b_o}{\sigma_{b1}/\sqrt{T}}.$$

In order to test $H_o: \alpha = \alpha_o$, more care must be taken. We showed in 3.3.3 that the asymptotic distribution of $\sqrt{T}(\alpha_{T1} - \alpha_o)$ exists only if $b_o \neq 0$. Consequently, the following hypothesis test can be performed only if assumption A13 ($b_o \neq 0$) holds. In that case, the appropriate test statistic would be

$$z_{\alpha_o} = \frac{\alpha_{T1} - \alpha_o}{\sigma_{\alpha 1}/\sqrt{T}}.$$

If $\alpha_o = 0$, this would provide a test of the hypothesis of no distance effect among the neighbors. Under this hypothesis, the effect of a neighbor on a location does not depend on the distance between the location and the neighbor, according to our specific weight structure, because $v_{ij}(0)$ takes the same value for all $i$ and $j \neq i$. Because A13 must be assumed for the validity of this test, it follows that one should test first for a neighbor effect and then for a distance effect among the neighbors.

We can construct $(1-\gamma)100\%$ confidence intervals for $a$, $b$, and $\alpha$ as follows:

$$a_{T1} \pm z_{\gamma/2}\,\frac{\sigma_{a1}}{\sqrt{T}},$$

$$b_{T1} \pm z_{\gamma/2}\,\frac{\sigma_{b1}}{\sqrt{T}},$$

and

$$\alpha_{T1} \pm z_{\gamma/2}\,\frac{\sigma_{\alpha 1}}{\sqrt{T}}.$$
For practical usage of these results, we would need to estimate $\sigma_{a1}$, $\sigma_{b1}$, and $\sigma_{\alpha 1}$. In addition to estimating $G$ and $\Gamma(0)$ with $G_{T1}$ and $\Gamma_{T1}(0)$, respectively, $M_o$ and $R_o$ need to be estimated. The forms of $M_o$ and $R_o$ in (3.3.29) and (3.3.30) imply that they can be estimated consistently using $b_{T1}$ and $\alpha_{T1}$ to calculate $M_{T1}$ and $R_{T1}$. Since $b_o \neq 0$ (A13), $M_{T1}^{-1}$ exists for the same reason that $M_o^{-1}$ exists. However, assumption of A13 is necessary only if the procedures involving $\alpha$ are used. Since all the components involved in the determination of $\sigma_{aT1}$, $\sigma_{bT1}$, and $\sigma_{\alpha T1}$ are consistent, it follows that $\sigma_{aT1}$, $\sigma_{bT1}$, and $\sigma_{\alpha T1}$ are consistent estimators of $\sigma_{a1}$, $\sigma_{b1}$, and $\sigma_{\alpha 1}$, respectively.

The results in this section, as well as the next, are asymptotic. The empirical investigations reported in Chapter VI should provide some insight into the use of the results for finite sample sizes.

5.2 Asymptotic Multiparameter Hypothesis Tests
and Confidence Regions

5.2.1  A  General  Result 

The following lemma provides a general result that will be useful in developing multiparameter hypothesis tests and confidence regions. Since this is a known result, it is stated without proof.

Lemma 5.1:

Let $\theta_T$ be an estimator of $\theta_o$ based on $T$ observations, with both $\theta_T$ and $\theta_o$ of length $k$. If $\sqrt{T}(\theta_T - \theta_o) \xrightarrow{D} N_k(0, \Sigma)$ as $T \to \infty$, then if $\Sigma^{-1}$ exists,

$$T(\theta_T - \theta_o)'\,\Sigma^{-1}(\theta_T - \theta_o) \xrightarrow{D} \chi^2(k) \quad \text{as } T \to \infty,$$

where $\chi^2(k)$ is the central chi-squared distribution with $k$ degrees of freedom.


5.2.2  The  Known  Weights  Case 

By using the result in (5.1.1) and applying Lemma 5.1, one can derive the asymptotic test of the joint hypothesis, $H_o: a = a_o$ and $b = b_o$. The test statistic is

$$X^2 = T[(a_T,b_T) - (a_o,b_o)]\,\Sigma_r^{-1}\,[(a_T,b_T) - (a_o,b_o)]', \tag{5.2.1}$$

where $\Sigma_r = H_r \Sigma H_r'$ and $H_r = H_1$ or $H_2$ depending on whether the YW#1 estimators or YW#2 estimators are being used. The form of the rejection region for a $\gamma$-level test would be $X^2 > \chi^2_\gamma(2)$, where $\chi^2_\gamma(k)$ is such that $P[\chi^2(k) > \chi^2_\gamma(k)] = \gamma$.

If $H_o$ is rejected, one might use the single-parameter procedures in 5.1.1 in an attempt to determine individual differences which could have led to the rejection of $H_o$.

For estimation purposes, an asymptotic $(1-\gamma)100\%$ joint confidence ellipsoid for $(a,b)$ could be constructed using a technique like that in Anderson (1958:55). The ellipsoid would consist of all values of $(a,b)$ for which

$$T[(a_T,b_T) - (a,b)]\,\Sigma_r^{-1}\,[(a_T,b_T) - (a,b)]' \le \chi^2_\gamma(2). \tag{5.2.2}$$

Since both (5.2.1) and (5.2.2) contain $\Sigma_r^{-1}$, the question of whether or not $\Sigma_r$ is invertible must be answered, where

$$\Sigma_r = H_r\,(G \otimes \Gamma^{-1}(0))\,H_r'.$$

Now both $G$ and $\Gamma^{-1}(0)$ are invertible, and $H_r$ ($H_1$, $H_2$) is clearly of rank 2, which implies that $\Sigma_r$ is of rank 2 and hence that $\Sigma_r^{-1}$ exists.

The results of this section still follow when, in practice, $\Sigma_r$ is estimated consistently by $\Sigma_{rT}$, where

$$\Sigma_{rT} = H_r\,(G_{rT} \otimes \Gamma_{rT}^{-1}(0))\,H_r'.$$

The matrix, $\Sigma_{rT}$, will then be invertible if both $G_{rT}$ and $\Gamma_{rT}(0)$ are invertible.
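The joint test (5.2.1) and the ellipsoid (5.2.2) can be sketched as follows for the two-parameter case. The estimates and the matrix standing in for $\Sigma_{rT}$ are hypothetical, and the function names are ours. The sketch relies on one standard fact: for 2 degrees of freedom, $P[\chi^2(2) > x] = e^{-x/2}$, so the critical value is $-2\ln\gamma$ and no statistical library is needed.

```python
import math

def quad_form_2d(d, sigma_r):
    """d * inverse(sigma_r) * d' for a 2-vector d and a 2x2 matrix sigma_r."""
    (s11, s12), (s21, s22) = sigma_r
    det = s11 * s22 - s12 * s21
    return (d[0] * (s22 * d[0] - s12 * d[1]) + d[1] * (s11 * d[1] - s21 * d[0])) / det

def wald_statistic(est, hyp, sigma_r, T):
    """Test statistic of the form (5.2.1)."""
    d = (est[0] - hyp[0], est[1] - hyp[1])
    return T * quad_form_2d(d, sigma_r)

def chi2_upper_point_2df(gamma):
    """Upper-gamma point of chi-squared(2), using P[chi2(2) > x] = exp(-x/2)."""
    return -2.0 * math.log(gamma)

def in_ellipsoid(point, est, sigma_r, T, gamma=0.05):
    """True if (a, b) = point satisfies the inequality (5.2.2)."""
    return wald_statistic(est, point, sigma_r, T) <= chi2_upper_point_2df(gamma)

# Hypothetical inputs (assumptions): estimates, estimated covariance, T.
est = (0.42, 0.20)
sigma_rT = [[2.0, 0.5], [0.5, 1.0]]
T = 100
x2 = wald_statistic(est, (0.0, 0.0), sigma_rT, T)
crit = chi2_upper_point_2df(0.05)
reject = x2 > crit
print(x2, crit, reject)
```

Plotting the boundary of the ellipsoid amounts to evaluating `in_ellipsoid` over a grid of $(a,b)$ values, which is one simple way to display (5.2.2).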


5.2.3  The  Variable  Weights  Case 

More care must be taken in developing and using multiparameter procedures in the variable weights case, because $\alpha_{T1}$'s behavior is evaluated only if $b_o \neq 0$.

From (5.1.2) and Lemma 5.1, it follows that to test the joint hypothesis, $H_o: a = a_o$, $b = b_o \neq 0$, and $\alpha = \alpha_o$, for large $T$, one can employ the test statistic,

$$X^2 = T[(a_{T1},b_{T1},\alpha_{T1}) - (a_o,b_o,\alpha_o)]\,\Sigma_1^{-1}\,[(a_{T1},b_{T1},\alpha_{T1}) - (a_o,b_o,\alpha_o)]', \tag{5.2.3}$$

where $\Sigma_1 = (M_o^{-1}R_o)(G \otimes \Gamma^{-1}(0))(M_o^{-1}R_o)'$. The form of the rejection region for a $\gamma$-level test would be $X^2 > \chi^2_\gamma(3)$.

If $H_o: a = a_o$, $b = b_o \neq 0$, and $\alpha = \alpha_o$ is rejected, the single-parameter procedures of 5.1.2 could then be used to detect individual differences.

If $b_o = 0$, one would need to first test $H_o: a = a_o$ and $b = 0$ using the YW#1 procedure given by (5.2.1). If $H_o$ is rejected, one could use the single-parameter tests in 5.1.2 to detect significant differences from the hypothesized values. The test of $H_o: \alpha = \alpha_o$ would be carried out only if one were willing to assume $b_o \neq 0$ (A13).

An asymptotic $(1-\gamma)100\%$ joint confidence ellipsoid for $(a_o,b_o,\alpha_o)$ would be all values of $(a,b,\alpha)$ for which

$$T[(a_{T1},b_{T1},\alpha_{T1}) - (a,b,\alpha)]\,\Sigma_1^{-1}\,[(a_{T1},b_{T1},\alpha_{T1}) - (a,b,\alpha)]' \le \chi^2_\gamma(3). \tag{5.2.4}$$

Any points of the form $(a,0,\alpha)$ would need to be eliminated from the ellipsoid since we consider the joint distribution of $(a_{T1},b_{T1},\alpha_{T1})$ only if $b_o \neq 0$. Using (5.2.2), a confidence interval could be constructed for $a$ in the case of $b = 0$. Even with the $(a,0,\alpha)$ values removed from the ellipsoid, it would seem that (5.2.4) would be a bit difficult to portray graphically. A better procedure might be to graph contours of $(a,\alpha)$ for selected nonzero $b$-values.

From (5.2.3) and (5.2.4), it is seen that $\Sigma_1$ must be invertible. If $M_o^{-1}R_o$ is of rank 3, it follows that $\Sigma_1$ is invertible since both $G$ and $\Gamma(0)$ are invertible. Upon examining the form of $R_o$ in (3.3.30), it follows that the rank of $R_o$ is 3 since we assume that there are at least two different distances (A4). Since $M_o^{-1}$ is clearly of rank 3, it follows that $M_o^{-1}R_o$ is of rank 3.

In practice, one would use $\Sigma_{T1}$, a consistent estimator of $\Sigma_1$, where

$$\Sigma_{T1} = (M_{T1}^{-1}R_{T1})\,(G_{T1} \otimes \Gamma_{T1}^{-1}(0))\,(M_{T1}^{-1}R_{T1})'.$$

Since $M_{T1}^{-1}R_{T1}$ is of rank 3, $\Sigma_{T1}$ will be invertible if $G_{T1}$ and $\Gamma_{T1}(0)$ are nonsingular.

5.3  Prediction  with  the  General  First-Order  Autoregressive 
Multivariate  Time  Series  Model 

5.3.1   Introduction 

One of the major purposes in developing a time series model is to use the model to predict or forecast future realizations of the series. Consider again the model for the first-order autoregressive multivariate time series,

$$y_t = B\,y_{t-1} + \varepsilon_t, \tag{5.3.1}$$

where assumptions A6 and A7 are true.

Suppose this process is observed for $T$ time periods, $t = 1,2,\ldots,T$. This section deals with the problem of predicting $y_{T+k}$, $k = 1,2,\ldots$, that is, predicting $k$ time units ahead.

We begin by writing the model in (5.3.1) for $t = T+k$ in terms of the observations by time $T$. We have

$$\begin{aligned}
y_{T+k} &= B\,y_{T+k-1} + \varepsilon_{T+k} \\
&= B(B\,y_{T+k-2} + \varepsilon_{T+k-1}) + \varepsilon_{T+k} \\
&= B^{2}\,y_{T+k-2} + B\,\varepsilon_{T+k-1} + \varepsilon_{T+k} \\
&= \cdots \\
&= B^{k}\,y_T + \sum_{j=0}^{k-1} B^{j}\,\varepsilon_{T+k-j}.
\end{aligned} \tag{5.3.2}$$

It follows that

$$E(y_{T+k} \mid y_T, y_{T-1}, \ldots) = B^{k}\,y_T,$$

since an implication of A6 and A7 is that $\varepsilon_t$ is independent of $y_{t-1}, y_{t-2}, \ldots$, for all $t$. Practically, we will be interested in the expected value of $y_{T+k}$ given only a finite number of past values. But the Markovian nature of the autoregressive model implies that, writing $\tilde{y}_t$ for the observed value of $y_t$,

$$E(y_{T+k} \mid y_T = \tilde{y}_T, y_{T-1} = \tilde{y}_{T-1}, \ldots, y_1 = \tilde{y}_1) = B^{k}\,\tilde{y}_T = E(y_{T+k} \mid y_T = \tilde{y}_T), \tag{5.3.3}$$

so that this practical consideration imposes no limitations.

5.3.2 Prediction When B is Known to be $B_o$

From (5.3.2), it would seem natural for one to use $B_o^{k}\,y_T$ to predict $y_{T+k}$ if one wished to use only a linear combination of past observations (i.e., $y_T, y_{T-1}, \ldots, y_1$). Call this predictor $\hat{y}_{T+k}$. An application of a more general result in Hannan (1970:127-130, 135-136) leads to the conclusion that $\hat{y}_{T+k} = B_o^{k}\,y_T$ is the best linear predictor of $y_{T+k}$ using the entire past, $y_T, y_{T-1}, \ldots$. The predictor, $\hat{y}_{T+k}$, is best in the sense that the minimum of $E[(y_{T+k} - \xi_{T+k})'(y_{T+k} - \xi_{T+k})]$ is at $\xi_{T+k} = \hat{y}_{T+k}$, where the minimum is taken over all linear predictors, $\xi_{T+k}$, of $y_{T+k}$ based on the entire past. So

$$\hat{y}_{T+k} = B_o^{k}\,y_T \tag{5.3.4}$$

is the minimum mean square linear predictor of $y_{T+k}$.

For a particular realization of the series, $\tilde{y}_T, \tilde{y}_{T-1}, \ldots, \tilde{y}_1$, the predicted value at time $T+k$ would be $B_o^{k}\,\tilde{y}_T$. From (5.3.3), we see that this predicted value equals

$$E(y_{T+k} \mid y_T = \tilde{y}_T, \ldots, y_1 = \tilde{y}_1) = E(y_{T+k} \mid y_T = \tilde{y}_T).$$

The error of prediction is defined to be the difference between the actual value at time $T+k$ and the predicted value. One important characteristic of these errors is their variance-covariance matrix. There are two approaches in evaluating this matrix that we will consider. One is conditional on the part of the particular past realization that is used in the prediction, $\tilde{y}_T$, and the other is unconditional over all values of $y_T$. If these yield different results, the experimenter would then need to decide which approach would be appropriate to the experimental situation.

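Case 1: We consider the conditional approach first.

Let $V_T(\cdot)$ and $E_T(\cdot)$ denote the conditional (on $y_T = \tilde{y}_T$) variance-covariance matrix and mean vector, respectively, and let $V(\cdot)$ and $E(\cdot)$ denote their unconditional counterparts. We then have from (5.3.2), (5.3.4), and assumptions A6 and A7, that

$$\begin{aligned}
V_T(\text{error of prediction}) &= V_T(y_{T+k} - \hat{y}_{T+k}) \\
&= V_T\Big(\sum_{j=0}^{k-1} B_o^{j}\,\varepsilon_{T+k-j}\Big) \\
&= E_T\Big[\Big(\sum_{j=0}^{k-1} B_o^{j}\,\varepsilon_{T+k-j}\Big)\Big(\sum_{j=0}^{k-1} B_o^{j}\,\varepsilon_{T+k-j}\Big)'\Big] \\
&= \sum_{j=0}^{k-1} B_o^{j}\,E(\varepsilon_{T+k-j}\,\varepsilon_{T+k-j}')\,(B_o')^{j} \\
&= \sum_{j=0}^{k-1} B_o^{j}\,G\,(B_o')^{j}.
\end{aligned} \tag{5.3.5}$$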

Case 2: We now consider the unconditional approach.

Using (5.3.5) and the arguments used in Case 1, we have

$$\begin{aligned}
V(y_{T+k} - \hat{y}_{T+k}) &= E[V_T(y_{T+k} - \hat{y}_{T+k})] + V[E_T(y_{T+k} - \hat{y}_{T+k})] \\
&= \sum_{j=0}^{k-1} B_o^{j}\,G\,(B_o')^{j} + V(0) \\
&= \sum_{j=0}^{k-1} B_o^{j}\,G\,(B_o')^{j} \\
&= V_T(y_{T+k} - \hat{y}_{T+k}).
\end{aligned} \tag{5.3.6}$$

This result can be found by an application of the general result in Jones (1964). If $k = 1$, we see that $V(y_{T+1} - \hat{y}_{T+1}) = G$, which agrees with our intuition.

Another form of the variance-covariance matrix can be derived by using the form of $\Gamma(0)$ in (4.1.4). This implies that

$$\begin{aligned}
\sum_{j=0}^{k-1} B_o^{j}\,G\,(B_o')^{j} &= \Gamma(0) - \sum_{j=k}^{\infty} B_o^{j}\,G\,(B_o')^{j} \\
&= \Gamma(0) - B_o^{k}\Big[\sum_{j=0}^{\infty} B_o^{j}\,G\,(B_o')^{j}\Big](B_o')^{k} \\
&= \Gamma(0) - B_o^{k}\,\Gamma(0)\,(B_o')^{k}.
\end{aligned} \tag{5.3.7}$$

By the same reasoning that was used in a similar case in 4.1.2, we can conclude that this variance-covariance matrix of prediction errors approaches $\Gamma(0)$ as $k \to \infty$. This result is intuitively appealing, since as one predicts farther ahead in time, the information provided by the past observations becomes less important. Consequently, the prediction variance-covariance matrix conditional on the past values approaches the unconditional variance-covariance matrix of the time series.
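The two forms of the prediction error covariance, the finite sum in (5.3.5) and the closed form (5.3.7), can be compared numerically. The sketch below is illustrative only: the matrices `B_o` and `G` are hypothetical (chosen so that A6 holds), $\Gamma(0)$ is approximated by a long partial sum of (4.1.4), and all function names are ours.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

def matpow(B, k):
    """B raised to the nonnegative integer power k."""
    n = len(B)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(k):
        result = matmul(result, B)
    return result

def weighted_sum(B, G, n_terms):
    """sum over j = 0..n_terms-1 of B^j G (B')^j."""
    Bt = transpose(B)
    n = len(G)
    term = [row[:] for row in G]
    total = [row[:] for row in G]
    for _ in range(1, n_terms):
        term = matmul(matmul(B, term), Bt)
        total = [[total[r][c] + term[r][c] for c in range(n)] for r in range(n)]
    return total

# Hypothetical known coefficient matrix and innovation covariance (assumptions).
B_o = [[0.5, 0.2], [0.1, 0.3]]
G = [[1.0, 0.3], [0.3, 2.0]]
k = 3

direct = weighted_sum(B_o, G, k)             # form (5.3.5)
gamma_0 = weighted_sum(B_o, G, 200)          # long partial sum of (4.1.4)
Bk = matpow(B_o, k)
shrink = matmul(matmul(Bk, gamma_0), transpose(Bk))
closed = [[gamma_0[r][c] - shrink[r][c] for c in range(2)] for r in range(2)]   # form (5.3.7)
gap = max(abs(direct[r][c] - closed[r][c]) for r in range(2) for c in range(2))
one_step = weighted_sum(B_o, G, 1)           # k = 1 reduces to G
print(gap, one_step)
```

Increasing `k` in this sketch shows the covariance matrix growing toward the approximation of $\Gamma(0)$, in line with the limiting argument above.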

5.3.3  Prediction  when  B  is  Unknown 

The more realistic prediction situation is to treat the matrix of coefficients, $B$, as unknown. In this situation the predictor would be

$$\hat{y}_{T+k} = B_T^{k}\,y_T. \tag{5.3.8}$$

An approximation to the variance-covariance matrix of the prediction errors will be derived here. Our approach will be similar in some respects to that of Box and Jenkins (1976:269) in the scalar case. We make the following assumption.

A15: The matrix, $B_T$, can be regarded as being independent of $y_T$.

Since $y_T$ is used in the calculation of the usual YW estimator, $B_T$, we know that A15 is probably not true. However, if $T$ is large, it would seem that the effect of $y_T$ on $B_T$ would be relatively insignificant. Thus, A15 could be used in deriving an approximation to the variance-covariance matrix of the prediction errors. We derive this approximation by using the mean vectors and variance-covariance matrices of asymptotic distributions determined by repeated applications of Lemma 3.3 to Lemma 3.2. We use the notation, "$\doteq$", instead of "$=$" at each point where an actual moment is replaced by a moment of the corresponding asymptotic distribution. Let $B_o$ be the true (unknown) value of $B$.

Case 1: We first consider the conditional case.

An application of Lemma 3.3 to Lemma 3.2 yields

$$E(B_o^{k} - B_T^{k}) \doteq Z,$$

where $Z$ is a matrix of zeroes. It follows then from A15 that, conditional on $y_T = \tilde{y}_T$,

$$E_T[(B_o^{k} - B_T^{k})\,y_T] = E(B_o^{k} - B_T^{k})\,\tilde{y}_T \doteq 0.$$

This result, along with (5.3.2), (5.3.8), A6, A7, and A15, implies that

$$\begin{aligned}
V_T(y_{T+k} - \hat{y}_{T+k}) &= V_T\Big[(B_o^{k} - B_T^{k})\,y_T + \sum_{j=0}^{k-1} B_o^{j}\,\varepsilon_{T+k-j}\Big] \\
&\doteq E_T[(y_{T+k} - \hat{y}_{T+k})(y_{T+k} - \hat{y}_{T+k})'] \\
&= E[(B_o^{k} - B_T^{k})\,\tilde{y}_T\,\tilde{y}_T'\,(B_o^{k} - B_T^{k})'] + E\Big[\sum_{j=0}^{k-1} B_o^{j}\,\varepsilon_{T+k-j}\,\varepsilon_{T+k-j}'\,(B_o')^{j}\Big] \\
&= V[(B_o^{k} - B_T^{k})\,\tilde{y}_T] + \sum_{j=0}^{k-1} B_o^{j}\,G\,(B_o')^{j}.
\end{aligned} \tag{5.3.9}$$

The following lemmas will be used to derive an approximation to $V[(B_o^{k} - B_T^{k})\,\tilde{y}_T]$.

Lemma 5.2:

Let $X$ be an $n \times n$ matrix of random variables for which the variance-covariance matrix of $x = (X_{11}, X_{12}, \ldots, X_{1n}, X_{21}, \ldots, X_{2n}, \ldots, X_{n1}, \ldots, X_{nn})'$ is $\Sigma$. Let $A_r$ and $B_r$ be $n \times n$ constant matrices, $r = 1, 2, \ldots, k$, and $S = \sum_{r=1}^{k} A_r X B_r$. Then the variance-covariance matrix of $s = (S_{11}, S_{12}, \ldots, S_{1n}, S_{21}, \ldots, S_{2n}, \ldots, S_{n1}, \ldots, S_{nn})'$ is $H_s \Sigma H_s'$, where

$$H_s = \sum_{r=1}^{k} (A_r \otimes B_r').$$


Proof:

Let $A$ and $B$ be $n \times n$ constant matrices and $P = AXB$. It is claimed that the variance-covariance matrix of $p = (P_{11}, P_{12}, \ldots, P_{1n}, P_{21}, \ldots, P_{2n}, \ldots, P_{n1}, \ldots, P_{nn})'$ is $H_p \Sigma H_p'$, where

$$H_p = (A \otimes B').$$

Let $R = AX$ and $r = (R_{11}, \ldots, R_{1n}, R_{21}, \ldots, R_{2n}, \ldots, R_{n1}, \ldots, R_{nn})'$. Since $R_{ij} = \sum_{k=1}^{n} A_{ik} X_{kj}$, it follows from a standard result that the variance-covariance matrix of $r$ is $H_r \Sigma H_r'$, where

$$
H_r = \left[\frac{\partial r_i}{\partial x_j}\right] =
\begin{bmatrix}
A_{11} I_n & A_{12} I_n & \cdots & A_{1n} I_n \\
A_{21} I_n & A_{22} I_n & \cdots & A_{2n} I_n \\
\vdots & \vdots & & \vdots \\
A_{n1} I_n & A_{n2} I_n & \cdots & A_{nn} I_n
\end{bmatrix}
= (A \otimes I_n), \tag{5.3.10}
$$

where $I_n$ is the $n \times n$ identity matrix.

Now let $T = X'$ and $t = (T_{11}, T_{12}, \ldots, T_{1n}, T_{21}, \ldots, T_{2n}, \ldots, T_{n1}, \ldots, T_{nn})'$. Then the variance-covariance matrix of $t$ is $H_t \Sigma H_t'$, where

$$H_t = \left[\frac{\partial t_i}{\partial x_j}\right].$$

Since there exist unique $r$ and $s$ such that $t_i = T_{rs} = X_{sr}$, it follows that the only nonzero element of the $i$th row of $H_t$ is that corresponding to $X_{sr}$, which contains "1". Since $x_i = X_{rs} = T_{sr}$, the only nonzero element of the $i$th column of $H_t$ is that corresponding to $T_{sr}$, which contains "1". This implies that $H_t = H_t'$.

Let $Q = XB$. By following the route from $X$ to $X'$ to $B'X'$ to $XB$, the previous two results imply that the variance-covariance matrix of $q = (Q_{11}, Q_{12}, \ldots, Q_{1n}, Q_{21}, \ldots, Q_{2n}, \ldots, Q_{n1}, \ldots, Q_{nn})'$ is

$$H_t (B' \otimes I_n) H_t\, \Sigma\, H_t' (B' \otimes I_n)' H_t' = H_t (B' \otimes I_n) H_t'\, \Sigma\, H_t (B \otimes I_n) H_t'.$$

It is known that $H_t (V \otimes W) H_t' = (W \otimes V)$, where $V$ and $W$ are $n \times n$ matrices. Then the above variance-covariance matrix can be written as $(I_n \otimes B')\, \Sigma\, (I_n \otimes B)$.

Since $P = AQ$, it follows from (5.3.10) that the variance-covariance matrix of $p$ is $(A \otimes I_n)(I_n \otimes B')\, \Sigma\, (I_n \otimes B)(A' \otimes I_n)$. Simplifying, we have

$$(A \otimes I_n)(I_n \otimes B')\, \Sigma\, (I_n \otimes B)(A' \otimes I_n) = (A \otimes B')\, \Sigma\, (A' \otimes B),$$

since $(A \otimes I_n)(I_n \otimes B') = (A I_n \otimes I_n B') = (A \otimes B')$. Therefore,

$$H_p = (A \otimes B'). \tag{5.3.11}$$


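Let $S_r = A_r X B_r$, let $s_r$ be the corresponding vector representation, and let $H_s = [\partial s_i / \partial x_j]$. Then the variance-covariance matrix of $s$ is $H_s \Sigma H_s'$. Now

$$
H_s = \left[\frac{\partial s_i}{\partial x_j}\right]
= \left[\frac{\partial \sum_{r=1}^{k} s_{r,i}}{\partial x_j}\right]
= \left[\sum_{r=1}^{k} \frac{\partial s_{r,i}}{\partial x_j}\right]
= \sum_{r=1}^{k} \left[\frac{\partial s_{r,i}}{\partial x_j}\right]
= \sum_{r=1}^{k} H_{s_r},
$$

where, from (5.3.11), $H_{s_r} = (A_r \otimes B_r')$. Therefore,

$$H_s = \sum_{r=1}^{k} (A_r \otimes B_r').$$

Lemma 5.3: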

Let $A$ and $B$ be two $n \times n$ matrices. Then a first-order approximation to $(A-B)^k$ in terms of powers of $B$ is

$$A^k - \sum_{j=0}^{k-1} A^j B A^{k-1-j}.$$

Proof:

The proof will be by induction. For $k = 2$,

$$
\begin{aligned}
(A-B)^2 &= (A-B)(A-B) \\
&= A^2 - BA - AB + B^2 \\
&= A^2 - BA - AB \\
&= A^2 - \sum_{j=0}^{1} A^j B A^{1-j}
\end{aligned}
$$

as a first-order approximation in $B$. Suppose the result holds for $k$. It follows that

$$
\begin{aligned}
(A-B)^{k+1} &= (A-B)(A-B)^k \\
&= (A-B)\Big(A^k - \sum_{j=0}^{k-1} A^j B A^{k-1-j}\Big) \\
&= A^{k+1} - BA^k - \sum_{j=0}^{k-1} A^{j+1} B A^{k-1-j} \\
&= A^{k+1} - \sum_{j=0}^{k} A^j B A^{k-j},
\end{aligned}
$$

and the result holds for $k+1$.
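Lemmas 5.2 and 5.3 lend themselves to a quick numerical check. The sketch below is an illustration only, not part of the original derivation: it assumes NumPy, uses arbitrary random matrices, and relies on the fact that NumPy's row-major `reshape(-1)` stacks a matrix by rows, which matches the vector representation used in the text. The names `first_order_power` and `approx_error` are hypothetical helpers.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 3
A = rng.normal(size=(n, n))
B = rng.normal(size=(n, n))
X = rng.normal(size=(n, n))

# Lemma 5.2 (single term): the row-stacked vector of A X B equals
# (A kron B') times the row-stacked vector of X.
lhs = (A @ X @ B).reshape(-1)
rhs = np.kron(A, B.T) @ X.reshape(-1)
vec_identity_error = np.max(np.abs(lhs - rhs))


def first_order_power(A, B, k):
    """A^k - sum_{j=0}^{k-1} A^j B A^(k-1-j), the Lemma 5.3 approximation."""
    mp = np.linalg.matrix_power
    correction = sum(mp(A, j) @ B @ mp(A, k - 1 - j) for j in range(k))
    return mp(A, k) - correction


def approx_error(eps, k=4):
    """Largest entrywise gap between (A - eps*B)^k and its approximation."""
    exact = np.linalg.matrix_power(A - eps * B, k)
    return np.max(np.abs(exact - first_order_power(A, eps * B, k)))


# The neglected terms are second order in B, so halving the perturbation
# should reduce the error by a factor of roughly four.
error_ratio = approx_error(1e-3) / approx_error(5e-4)
```

The ratio near four is what one would expect if only terms of second and higher order in $B$ are dropped, which is the sense in which the lemma is a first-order approximation.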

To derive the asymptotic variance-covariance matrix of $(B_o^k - B_T^k)\,y_T$, we consider a two-stage transformation, from $B_T$ to $B_T^k$ to $B_T^k y_T$. The following lemma gives the asymptotic variance-covariance matrix for the first stage of the transformation.

Lemma 5.4:

Define $A_{Tk} = B_T^k$ and $A_{ok} = B_o^k$. Let $\lambda_{Tk}$ and $\lambda_{ok}$ be the corresponding vector representations of $B_T^k$ and $B_o^k$, respectively. Then

$$\sqrt{T}\,(\lambda_{Tk} - \lambda_{ok}) \xrightarrow{D} N_{n^2}(0, \Lambda_k) \quad \text{as } T \to \infty,$$

where $\Lambda_k = H_k \Sigma H_k'$, $\Sigma = (G \otimes \Gamma^{-1}(0))$, and $H_k = \sum_{j=0}^{k-1} \big(B_o^j \otimes (B_o')^{k-1-j}\big)$.

Proof:

From Lemma 5.3, we have

$$
\begin{aligned}
B_o^k - B^k &= B_o^k - [B_o - (B_o - B)]^k \\
&\doteq B_o^k - B_o^k + \sum_{j=0}^{k-1} B_o^j (B_o - B) B_o^{k-1-j} \\
&= \sum_{j=0}^{k-1} B_o^j (B_o - B) B_o^{k-1-j} \\
&= k B_o^k - \sum_{j=0}^{k-1} B_o^j\, B\, B_o^{k-1-j}.
\end{aligned}
$$

We consider a first-order approximation here because that is what we used in the asymptotic distribution results in Chapter III. The above approximation to $B_o^k - B^k$ is just the Taylor expansion (in matrix form) of $(B_o^k - B^k)$ about $B_o$ to the first-order term. Consequently all first-order partial derivatives with respect to the $B_{ij}$'s evaluated at $B_{o,ij}$ can be found in this approximating matrix. Since this matrix will be evaluated at $B = B_T$, it is enough to consider $\sum_{j=0}^{k-1} B_o^j\, B_T\, B_o^{k-1-j}$ for determining the asymptotic variance-covariance matrix. By applying Lemma 5.2 to Lemma 3.2, we can conclude from Lemma 3.3 that

$$\sqrt{T}\,(\lambda_{Tk} - \lambda_{ok}) \xrightarrow{D} N_{n^2}(0, \Lambda_k) \quad \text{as } T \to \infty,$$

where $\Lambda_k = H_k \Sigma H_k'$, $\Sigma = (G \otimes \Gamma^{-1}(0))$, and $H_k = \sum_{j=0}^{k-1} \big(B_o^j \otimes (B_o')^{k-1-j}\big)$.

Theorem 5.5:

With $\Lambda_k$ defined as in Lemma 5.4, we have

$$
\begin{aligned}
V_T(y_{T+k} - \hat{y}_{T+k}) &\doteq \frac{1}{T}(I_n \otimes y_T')\,\Lambda_k\,(I_n \otimes y_T) + \sum_{j=0}^{k-1} B_o^j\, G\, (B_o')^j \\
&= \frac{1}{T}(I_n \otimes y_T')\,\Lambda_k\,(I_n \otimes y_T) + \Gamma(0) - B_o^k\, \Gamma(0)\, (B_o')^k.
\end{aligned}
$$

Proof:

From the comments following (5.3.9), we know that our objective is to approximate $V[(B_o^k - B_T^k)\,y_T]$. Since $y_T$ is regarded as a constant vector, applying Lemma 3.3 to Lemma 5.4 implies that

$$V[(B_o^k - B_T^k)\,y_T] \doteq \frac{1}{T} H_y \Lambda_k H_y',$$

where $H_y = [\partial h_i(A) / \partial A_{jl}]$ and $h_i(A) = \sum_{j=1}^{n} A_{ij}\, y_{Tj}$ for all $i$. We see that

$$
H_y =
\begin{bmatrix}
y_T' & 0 & \cdots & 0 \\
0 & y_T' & \cdots & 0 \\
\vdots & & \ddots & \vdots \\
0 & 0 & \cdots & y_T'
\end{bmatrix}
= (I_n \otimes y_T').
$$

The result follows by using (5.3.9) with (5.3.7).

Note that the asymptotic normality of $\sqrt{T}\,[(B_o^k - B_T^k)\,y_T]$ was an intermediate step in the preceding proof.

Case 2: We now consider the unconditional case.
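In the conditional case it was shown that $E_T(y_{T+k} - \hat{y}_{T+k}) \doteq 0$. Therefore,

$$V(y_{T+k} - \hat{y}_{T+k}) \doteq E[V_T(y_{T+k} - \hat{y}_{T+k})].$$

From Theorem 5.5, we have that

$$
\begin{aligned}
V(y_{T+k} - \hat{y}_{T+k}) &\doteq \frac{1}{T} E[(I_n \otimes y_T')\,\Lambda_k\,(I_n \otimes y_T)] + \sum_{j=0}^{k-1} B_o^j\, G\, (B_o')^j \\
&= \frac{1}{T} E[(I_n \otimes y_T')\,\Lambda_k\,(I_n \otimes y_T)] + \Gamma(0) - B_o^k\, \Gamma(0)\, (B_o')^k,
\end{aligned}
\tag{5.3.12}
$$

where $E[(I_n \otimes y_T')\,\Lambda_k\,(I_n \otimes y_T)]$ is evaluated by using (5.3.13) through (5.3.15) and $\Lambda_k$ is given in Lemma 5.4.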


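Since $\Lambda_k$ is $n^2 \times n^2$, we can partition $\Lambda_k$ into $n^2$ submatrices, $\Lambda_{kij}$, each $n \times n$, such that

$$
\Lambda_k =
\begin{bmatrix}
\Lambda_{k11} & \Lambda_{k12} & \cdots & \Lambda_{k1n} \\
\Lambda_{k21} & \Lambda_{k22} & \cdots & \Lambda_{k2n} \\
\vdots & \vdots & & \vdots \\
\Lambda_{kn1} & \Lambda_{kn2} & \cdots & \Lambda_{knn}
\end{bmatrix}.
\tag{5.3.13}
$$

It follows that

$$
(I_n \otimes y_T')\,\Lambda_k\,(I_n \otimes y_T) =
\begin{bmatrix}
y_T' \Lambda_{k11} & \cdots & y_T' \Lambda_{k1n} \\
y_T' \Lambda_{k21} & \cdots & y_T' \Lambda_{k2n} \\
\vdots & & \vdots \\
y_T' \Lambda_{kn1} & \cdots & y_T' \Lambda_{knn}
\end{bmatrix}
(I_n \otimes y_T)
=
\begin{bmatrix}
y_T' \Lambda_{k11}\, y_T & \cdots & y_T' \Lambda_{k1n}\, y_T \\
y_T' \Lambda_{k21}\, y_T & \cdots & y_T' \Lambda_{k2n}\, y_T \\
\vdots & & \vdots \\
y_T' \Lambda_{kn1}\, y_T & \cdots & y_T' \Lambda_{knn}\, y_T
\end{bmatrix}.
\tag{5.3.14}
$$

To evaluate the expectation in (5.3.12), it is enough to specify $E(y_T' \Lambda_{kij}\, y_T)$ for all $i$ and $j$. Now

$$
y_T' \Lambda_{kij}\, y_T = \sum_{r=1}^{n} \sum_{s=1}^{n} y_{T,r}\, \Lambda_{kij,rs}\, y_{T,s}
= \sum_{r=1}^{n} \sum_{s=1}^{n} \Lambda_{kij,rs}\, y_{T,r}\, y_{T,s}.
$$

Therefore,

$$
E(y_T' \Lambda_{kij}\, y_T) = \sum_{r=1}^{n} \sum_{s=1}^{n} \Lambda_{kij,rs}\, E(y_{T,r}\, y_{T,s})
= \sum_{r=1}^{n} \sum_{s=1}^{n} \Lambda_{kij,rs}\, \Gamma_{rs}(0).
\tag{5.3.15}
$$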


For  both  the  conditional  and  unconditional  cases,  we  can  estimate 
consistently  these  approximate  variance-covariance  matrices  since  each 
component  can  be  estimated  consistently.   It  is  also  seen  by  observing 
the  expressions  in  Theorem  5.5  and  (5.3.12)  that  the  second  term  is  the 
more  important  term  for  large  T. 
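The two approximations, Theorem 5.5 for the conditional case and (5.3.12) for the unconditional case, can be sketched in a few lines. The code below is a hedged illustration assuming NumPy: the function names, the three-location weight matrix `W`, and the truncated series used for $\Gamma(0)$ are all hypothetical choices made for the example, and the true $B_o$, $G$, and $\Gamma(0)$ are taken as given rather than estimated.

```python
import numpy as np

mp = np.linalg.matrix_power


def lambda_k(B, G, Gamma0, k):
    """Lambda_k = H_k Sigma H_k' with Sigma = G kron Gamma(0)^{-1} (Lemma 5.4)."""
    H_k = sum(np.kron(mp(B, j), mp(B.T, k - 1 - j)) for j in range(k))
    Sigma = np.kron(G, np.linalg.inv(Gamma0))
    return H_k @ Sigma @ H_k.T


def noise_term(B, G, k):
    """sum_{j=0}^{k-1} B^j G B'^j, the second term of (5.3.9)."""
    return sum(mp(B, j) @ G @ mp(B.T, j) for j in range(k))


def conditional_cov(B, G, Gamma0, y_T, k, T):
    """Theorem 5.5: (1/T)(I kron y')Lambda_k(I kron y) plus the noise term."""
    n = B.shape[0]
    H_y = np.kron(np.eye(n), y_T.reshape(1, n))  # n x n^2 block-diagonal of y'
    return H_y @ lambda_k(B, G, Gamma0, k) @ H_y.T / T + noise_term(B, G, k)


def unconditional_cov(B, G, Gamma0, k, T):
    """(5.3.12), using E(y' Lambda_kij y) = sum_rs Lambda_kij,rs Gamma_rs(0)."""
    n = B.shape[0]
    # Axis order after reshape: [i, r, j, s], i.e. block (i, j), entry (r, s).
    blocks = lambda_k(B, G, Gamma0, k).reshape(n, n, n, n)
    return np.einsum("irjs,rs->ij", blocks, Gamma0) / T + noise_term(B, G, k)


# Hypothetical three-location example with equal neighbor weights.
W = np.array([[0.0, 0.5, 0.5], [0.5, 0.0, 0.5], [0.5, 0.5, 0.0]])
B_o = 0.5 * np.eye(3) + 0.25 * W
G = np.eye(3)
# Gamma(0) as a truncated series; adequate here because the largest
# eigenvalue of B_o is 0.75.
Gamma0 = sum(mp(B_o, j) @ G @ mp(B_o.T, j) for j in range(400))
```

Because the first term carries the factor $1/T$ while the second does not depend on $T$, the sketch also makes it easy to see numerically why the second term dominates for large $T$.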




5.4  Prediction  with  the  Spatial  First-Order  Autoregressive 
Multivariate  Time  Series  Model 

If the parameters are known, the variance-covariance matrix of the prediction errors can be obtained from the results in 5.3.2 using $B_o = B_{ro}$. Consequently, in this section, our efforts will be directed to those cases in which the parameters are estimated.

5.4.1  The  Known  Weights  Case 

The results in this section will be general enough so that they will apply to both the YW#1 and YW#2 estimators. A review of the work in Section 5.3 reveals that the only consideration for our special model and estimators, in addition to $B_o = B_{ro}$, is that the $\Sigma$ in the $\Lambda_k$ of Lemma 5.4 should be replaced by the variance-covariance matrix of the asymptotic distribution of $\sqrt{T}\,(\beta_{rT} - \beta_{ro})$, where $\beta_{rT}$ and $\beta_{ro}$ are the vector representations of $B_{rT}$ and $B_{ro}$, respectively.

In order to apply Lemma 3.3 to the results in 3.2.3, we define $\theta = (a, b)'$. Let $h$ be a vector of functions of $\theta$ such that

$$h(a, b) = (g_{11}, g_{12}, \ldots, g_{1n}, g_{21}, \ldots, g_{2n}, \ldots, g_{n1}, \ldots, g_{nn})',$$

where $g_{ij}$ is a function of $\theta$ for all $i$ and $j$ such that $g_{ii}(a, b) = a$ for all $i$ and $g_{ij}(a, b) = b\,w_{ij}$ for all $i$ and $j \neq i$. Let

$$
H_w = \left[\frac{\partial h_i(\theta)}{\partial \theta_j}\right] =
\begin{bmatrix}
1 & 0 & \cdots & 0 & 0 & 1 & 0 & \cdots & 0 & \cdots & 0 & 1 \\
0 & w_{12} & \cdots & w_{1n} & w_{21} & 0 & w_{23} & \cdots & w_{2n} & \cdots & w_{n,n-1} & 0
\end{bmatrix}'.
\tag{5.4.1}
$$

Applying Lemma 3.3 to (5.1.1) yields

$$\sqrt{T}\,(\beta_{rT} - \beta_{ro}) \xrightarrow{D} N_{n^2}(0, H_w \Sigma_r H_w') \quad \text{as } T \to \infty,$$

where $H_w$ is given in (5.4.1) and $\Sigma_r = H_r \Sigma H_r'$, where $H_r = H_1$ or $H_2$, given in 3.2.3, and $\Sigma = (G \otimes \Gamma^{-1}(0))$.

The only modification in this case is to replace $\Sigma$ in $\Lambda_k$ with $H_w \Sigma_r H_w'$.

5.4.2   The  Variable  Weights  Case 

As in the known weights case, the only modification to the general results of Section 5.3 is to replace $\Sigma$ with the variance-covariance matrix of the asymptotic distribution of $\sqrt{T}\,(\beta_{rT} - \beta_{ro})$, where $\beta_{rT}$ is the vector representation of the YW#1 estimator of $B_{ro}$.

Let $\theta = (a, b, \alpha)'$ and $h$ be a vector of functions of $\theta$ such that

$$h(a, b, \alpha) = (g_{11}, g_{12}, \ldots, g_{1n}, g_{21}, \ldots, g_{2n}, \ldots, g_{n1}, \ldots, g_{nn})',$$

where $g_{ii}(\theta) = a$ for all $i$ and $g_{ij}(\theta) = b\,v_{ij}(\alpha)$ for all $i$ and $j \neq i$, where $v_{ij}(\alpha)$ is given in (2.3.2). Let

$$
H_v = \left[\frac{\partial h_i(\theta)}{\partial \theta_j}\right]_{\theta = \theta_o} =
\begin{bmatrix}
1 & 0 & \cdots & 0 & 0 & 1 & 0 & \cdots & 0 & \cdots & 0 & 1 \\
0 & v_{12} & \cdots & v_{1n} & v_{21} & 0 & v_{23} & \cdots & v_{2n} & \cdots & v_{n,n-1} & 0 \\
0 & p_{12} & \cdots & p_{1n} & p_{21} & 0 & p_{23} & \cdots & p_{2n} & \cdots & p_{n,n-1} & 0
\end{bmatrix}',
\tag{5.4.2}
$$

where $v_{ij} = v_{ij}(\alpha_o)$ and

$$
p_{ij} = \left.\frac{\partial g_{ij}(\theta)}{\partial \alpha}\right|_{\theta = \theta_o}
= b_o\, v_{ij}(\alpha_o) \Big[\sum_{k=1,\,k \neq i}^{n} d_{ik}\, v_{ik}(\alpha_o) - d_{ij}\Big].
$$

Applying Lemma 3.3 to (5.1.2) yields

$$\sqrt{T}\,(\beta_{rT} - \beta_{ro}) \xrightarrow{D} N_{n^2}(0, H_v \Sigma_1 H_v') \quad \text{as } T \to \infty,$$

where $H_v$ is given in (5.4.2) and $\Sigma_1$ in (5.1.2).

The only modification in this case is to replace $\Sigma$ in $\Lambda_k$ with $H_v \Sigma_1 H_v'$.
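The Jacobian $H_v$ of (5.4.2) can be assembled directly from a distance matrix. The sketch below assumes NumPy and uses hypothetical helper names (`weights`, `h`, `H_v`); it returns the $n^2 \times 3$ matrix whose columns are the derivatives with respect to $a$, $b$, and $\alpha$, using the exponential weight function $v_{ij}(\alpha) = e^{-\alpha d_{ij}} / \sum_{k \neq i} e^{-\alpha d_{ik}}$. It is meant as a check on the formula for $p_{ij}$, not as the original implementation.

```python
import numpy as np


def weights(alpha, D):
    """v_ij(alpha) = exp(-alpha d_ij) / sum_{k != i} exp(-alpha d_ik), v_ii = 0."""
    E = np.exp(-alpha * D)
    np.fill_diagonal(E, 0.0)
    return E / E.sum(axis=1, keepdims=True)


def h(theta, D):
    """Row-stacked entries of B_r = a I + b V(alpha)."""
    a, b, alpha = theta
    n = D.shape[0]
    return (a * np.eye(n) + b * weights(alpha, D)).reshape(-1)


def H_v(theta, D):
    """n^2 x 3 Jacobian of h with respect to (a, b, alpha)."""
    a, b, alpha = theta
    n = D.shape[0]
    V = weights(alpha, D)
    # sum_k d_ik v_ik for each row i (the diagonal contributes zero).
    mean_dist = (D * V).sum(axis=1, keepdims=True)
    P = b * V * (mean_dist - D)  # p_ij; zero on the diagonal because v_ii = 0
    return np.column_stack([np.eye(n).reshape(-1), V.reshape(-1), P.reshape(-1)])
```

Given a $3 \times 3$ matrix standing in for $\Sigma_1$, the replacement for $\Sigma$ in $\Lambda_k$ would then be `H_v(theta, D) @ Sigma_1 @ H_v(theta, D).T`.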

In both the known weights and variable weights cases, we can estimate consistently the matrices replacing $\Sigma$ and hence, we can estimate consistently the approximate variance-covariance matrices.

It should be noted that $\sum_{j=0}^{k-1} B_{rT}^j\, G_{rT}\, (B_{rT}')^j$ would probably not be equal to $\Gamma_{rT}(0) - B_{rT}^k\, \Gamma_{rT}(0)\, (B_{rT}')^k$, but if convergence was attained in the calculation of $\Gamma_{rT}(0)$ by (4.2.3), the difference should be small.

5.5   Review  of  Assumptions  Introduced  in  Chapter  V 
The  only  assumption  introduced  in  this  chapter  was  A15  in 
Section  5.3.3. 

A15: The matrix, $B_T$, can be regarded as being independent of $y_T$.


CHAPTER  VI 


EMPIRICAL  RESULTS 


6.0  Preamble 


The  primary  purpose  for  this  chapter  is  to  provide  insight  into 
the  actual  performance  of  the  estimators  presented  in  the  previous 
chapters.   There  are  two  parts  to  the  empirical  results.   In  Section  6.1, 
the  results  of  some  Monte  Carlo  studies  will  be  given  and  in  Section  6.2, 
these  estimation  procedures  will  be  applied  to  a  real  data  example. 

6.1  Monte  Carlo  Studies 
6.1.1  Introduction 

Since  the  work  in  the  variable  weights  case  represents  a  step 
beyond  much  of  the  existing  work  for  spatially-related  random  variables, 
all  simulations  were  performed  using  variable  weights.   Although  the 
exponential  weight  function  can  be  used  for  both  regular  and  irregular 
arrays  of  locations,  regular  arrays  were  chosen  for  our  simulation  work. 
Two  arrays  were  considered,  one  with  13  locations  and  one  with  25. 
The  25-location  array  is  shown  in  Figure  6.1.   Locations  1  through  13 
comprise  the  13-location  array.   (The  number  corresponding  to  each 
location  is  given  below  the  point  for  that  location.)   Adjacent  locations 
on  the  same  row  or  column  are  taken  to  be  one  unit  apart. 
The model used was of the form,

$$y_t = B_r\, y_{t-1} + \varepsilon_t, \tag{6.1.1}$$

where $B_{r,ii} = a$ and $B_{r,ij} = b\,v_{ij}(\alpha)$, $j \neq i$. The function $v_{ij}$ is given by

$$v_{ij}(\alpha) = \frac{e^{-\alpha d_{ij}}}{\sum_{k=1,\,k \neq i}^{n} e^{-\alpha d_{ik}}}.$$
                    •
                   17
               •    •    •
              16    3   18
          •    •    •    •    •
         15    2    5    8   19
     •    •    •    •    •    •    •
    14    1    4    7   10   13   20
          •    •    •    •    •
         25    6    9   12   21
               •    •    •
              24   11   22
                    •
                   23

Figure 6.1. Array of locations for simulations.

The values of $(a, b, \alpha)$ chosen were such that A6 is true. That is, the roots of $f(z) = |I - B_r z| = 0$ lie outside the unit circle. So that the variance-covariance matrix of the error terms, $G$, would reflect the spatial dependency, we let $G_{ij} = e^{-\alpha d_{ij}}$. Within a given model specification (i.e., for a given $(a, b, \alpha, n)$ combination), the simulated errors were to represent independent samples from a $N_n(0, G)$ population and thus would satisfy A7. (Assumption A7 states that the $\varepsilon_t$'s are independent and identically distributed with mean 0 and variance-covariance matrix $G$.)

Twenty-four combinations of $a$, $b$, $\alpha$, $n$, and $T$ were used: all combinations of

(a, b) = (.5, .25), (.25, .5),
α = 1.0, 5.0, 10.0,
n = 13, 25,
and
T = 100, 1000.
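The simulation design just described can be sketched as follows. This is an illustration under stated assumptions, not the program used for the reported results: it assumes NumPy, the function names are hypothetical, locations are indexed in an arbitrary order rather than by the numbering of Figure 6.1, and the start-up treatment (a zero initial vector followed by a discarded burn-in of 200 periods) is an assumption, since the text does not say how initial values were handled.

```python
import itertools
import numpy as np


def diamond_array(radius):
    """Grid points with |x| + |y| <= radius: 13 points for radius 2, 25 for 3."""
    pts = [(x, y)
           for x in range(-radius, radius + 1)
           for y in range(-radius, radius + 1)
           if abs(x) + abs(y) <= radius]
    return np.array(pts, dtype=float)


def distance_matrix(pts):
    """Euclidean distances; adjacent points on a row or column are one unit apart."""
    diff = pts[:, None, :] - pts[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def build_model(a, b, alpha, D):
    """Return B_r (a on the diagonal, b v_ij(alpha) off it) and G_ij = exp(-alpha d_ij)."""
    E = np.exp(-alpha * D)
    G = E.copy()                      # ones on the diagonal since d_ii = 0
    np.fill_diagonal(E, 0.0)
    V = E / E.sum(axis=1, keepdims=True)
    return a * np.eye(len(D)) + b * V, G


def simulate(B_r, G, T, rng, burn_in=200):
    """Generate T observations from y_t = B_r y_{t-1} + e_t with e_t ~ N(0, G)."""
    n = B_r.shape[0]
    errors = rng.multivariate_normal(np.zeros(n), G, size=T + burn_in)
    y = np.zeros(n)                   # assumed start-up value
    out = np.empty((T, n))
    for t in range(T + burn_in):
        y = B_r @ y + errors[t]
        if t >= burn_in:
            out[t - burn_in] = y
    return out


# All combinations of (a, b), alpha, n, and T listed in the text.
combos = list(itertools.product([(0.5, 0.25), (0.25, 0.5)],
                                [1.0, 5.0, 10.0], [13, 25], [100, 1000]))
```

A single run for one combination would be, for example, `simulate(*build_model(0.5, 0.25, 1.0, distance_matrix(diamond_array(2))), 100, np.random.default_rng(0))`.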

The results of these simulations are reported in Sections 6.1.4 through 6.1.7. Caution must be exercised in interpreting these results, since only one simulation was performed for each combination of $(a, b)$, $\alpha$, $n$, and $T$. However, these results can aid in developing a general understanding of the relationships among our estimators and the effect of sample size on our estimators' performance.

To allow for more specific interpretations of simulation results with these combinations of $(a, b)$, $\alpha$, $n$, and $T$, a large number of simulations would need to be performed for each combination. However, the substantial cost, per combination, of multiple simulations prevented us from performing them on a large scale as would be necessary here. We did perform one thousand simulations for one combination of $(a, b)$, $\alpha$, $n$, and $T$ and report those results in Section 6.1.8.

Before presenting any simulation results, we make some comparisons of our selected models in the next two sections.

6.1.2 Comparison of Models with Respect to Assumption A6

Since all roots of $f(z) = |I - B_r z| = 0$ need to be outside the unit circle, it is enough for the minimum absolute root to be greater than one. It was hoped that a procedure could be developed which would allow a check of A6 by using only $a$, $b$, and $\alpha$. For the univariate case ($b = 0$), the necessary and sufficient condition is $|a| < 1$. For $n = 2$ (a neighbor gets weight unity), it is not difficult to show that the necessary and sufficient conditions are $|a \pm b| < 1$. We were unable to find simple conditions like these for $n \geq 3$. Cliff, Haggett et al. (1975:201) point out that $|a + b| < 1$ is a necessary but not a sufficient condition. Since we could not find a simple expression for the general case, A6 was checked numerically.

If $z$ is such that $|I - B_r z| = 0$, then $z$ must be the reciprocal of an eigenvalue of $B_r$. Although $B_r$ is nonsymmetric, it is of the form $B_r = C^{-1} D$, where $C$ and $D$ are both real symmetric matrices and $C$ is positive definite. For $B_r$, we have

$$C_{ii} = \sum_{k=1,\,k \neq i}^{n} e^{-\alpha d_{ik}}, \qquad C_{ij} = 0, \quad j \neq i,$$

$$D_{ii} = a\, C_{ii}, \qquad \text{and} \qquad D_{ij} = b\, e^{-\alpha d_{ij}}, \quad j \neq i.$$

The subroutine, NROOT (see IBM (1970:166)), is designed for matrices of this form and was used to find the eigenvalues of $B_r$.

For each of the combinations of $(a, b)$, $\alpha$, and $n$, the minimum and maximum absolute roots of $f(z) = |I - B_r z| = 0$ are given in Table 6.1.

Since the minimum absolute root is our primary concern, it is interesting to note that this root equals $(a + b)^{-1}$ for all cases. The number of locations and $\alpha$ do have some effect on the roots as evidenced by the different maximum absolute roots. We will show that $(a + b)$ is always an eigenvalue of $B_r$ and hence $(a + b)^{-1}$ is always a root of $f(z) = |I - B_r z| = 0$. The other values for the $n = 2$ case, $(a - b)$, $(b - a)$, and $-(a + b)$, are not necessarily eigenvalues for $n \geq 3$.

For $(a + b)$, where $b \neq 0$,

$|B_r - (a + b)I| = 0$ iff
$|(aI + bW) - (a + b)I| = 0$ iff
$|bW - bI| = 0$ iff
$|W - I| = 0$ iff

unity is an eigenvalue of $W$. Since $\sum_{j=1}^{n} w_{ij} = 1$ for all $i$, $W\mathbf{1} = 1 \cdot \mathbf{1}$, where $\mathbf{1}' = (1, 1, \ldots, 1)$. Therefore, unity is an eigenvalue of $W$. (Note that $W$ can be any scaled weight matrix.)
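The numerical check of A6 can be imitated with a symmetric eigenvalue routine. The sketch below assumes NumPy and does not call NROOT; instead it uses the fact that, with $C$ diagonal and positive, $C^{-1}D$ has the same eigenvalues as the symmetric matrix $C^{-1/2} D\, C^{-1/2}$. The function names are hypothetical, and the location ordering is arbitrary (eigenvalues do not depend on it).

```python
import numpy as np


def diamond_distances(radius):
    """Distance matrix for grid points with |x| + |y| <= radius (13 or 25 points)."""
    pts = np.array([(x, y)
                    for x in range(-radius, radius + 1)
                    for y in range(-radius, radius + 1)
                    if abs(x) + abs(y) <= radius], dtype=float)
    diff = pts[:, None, :] - pts[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def absolute_roots(a, b, alpha, D):
    """Minimum and maximum absolute roots of |I - B_r z| = 0."""
    E = np.exp(-alpha * D)
    np.fill_diagonal(E, 0.0)
    c = E.sum(axis=1)                  # diagonal of C
    D_mat = b * E + np.diag(a * c)     # symmetric D, so that B_r = C^{-1} D
    scale = 1.0 / np.sqrt(c)
    sym = scale[:, None] * D_mat * scale[None, :]   # C^{-1/2} D C^{-1/2}
    eig = np.linalg.eigvalsh(sym)      # real eigenvalues, equal to those of B_r
    roots = 1.0 / np.abs(eig)
    return roots.min(), roots.max()


# Roots for every (a, b), alpha, n combination used in the simulations.
results = {(a, b, alpha, n): absolute_roots(a, b, alpha, diamond_distances(r))
           for (a, b) in [(0.5, 0.25), (0.25, 0.5)]
           for alpha in [1.0, 5.0, 10.0]
           for (n, r) in [(13, 2), (25, 3)]}
```

Since $a$ and $b$ are positive and the rows of $W$ are nonnegative and sum to one, no eigenvalue of $B_r$ exceeds $a + b$ in absolute value, so the minimum absolute root should come out as $(a + b)^{-1}$ in every case.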


Table 6.1
Minimum and Maximum Absolute Roots of $f(z) = |I - B_r z| = 0$

                                 Absolute Root
   a      b       α      n     Minimum   Maximum

  .5     .25     1.0     13     1.333     2.253
  .5     .25     1.0     25     1.333     2.206
  .5     .25     5.0     13     1.333     3.275
  .5     .25     5.0     25     1.333     3.262
  .5     .25    10.0     13     1.333     3.879
  .5     .25    10.0     25     1.333     3.879
  .25    .5      1.0     13     1.333     7.261
  .25    .5      1.0     25     1.333     6.380
  .25    .5      5.0     13     1.333    48.47
  .25    .5      5.0     25     1.333    23.30
  .25    .5     10.0     13     1.333   346.4
  .25    .5     10.0     25     1.333   158.7

6.1.3 Comparison of Models with Respect to Weights and Correlations

Scaling the weights so that they add to one for each location affects only their magnitude and not the rate of change as a function of distance. In order to understand the manner in which the weights vary for different values of $\alpha$, the weights for the neighbors of location 7 are given in Table 6.2. Location 7 was chosen because it is the central location for both the 13-location and 25-location arrays. It is seen from Figure 6.1 that there are three different distances for the neighbors of location 7 in the smaller array and five distances in the larger array.

Table 6.2

Weights Assigned to the Neighbors of Location 7

                           Weights by Distance from Location 7
    α      n      1 (4)      √2 (4)      2 (4)        √5 (0,8)      3 (0,4)

   1.0     13     .123       .0814       .0453
   1.0     25     .0911      .0602       .0335        .0265         .0123
   5.0     13     .221       .0278       .00149
   5.0     25     .220       .0277       .00148       .000455       .00000998
  10.0     13     .246       .00391      .0000112
  10.0     25     .246       .00391      .0000112     .00000105     .000000000507
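The entries of Table 6.2 follow directly from the exponential weight function applied to the distances from the central location. The short sketch below, assuming NumPy and a hypothetical helper `centre_weights`, recomputes the weight attached to a neighbor at each distinct distance from location 7, taking the arrays to be the diamond-shaped grids of Figure 6.1 with unit spacing.

```python
import numpy as np


def centre_weights(alpha, radius):
    """Weight given by the central location to a neighbor at each distinct distance.

    radius=2 gives the 13-location array, radius=3 the 25-location array.
    """
    pts = np.array([(x, y)
                    for x in range(-radius, radius + 1)
                    for y in range(-radius, radius + 1)
                    if abs(x) + abs(y) <= radius], dtype=float)
    d = np.sqrt((pts ** 2).sum(axis=1))   # distances from the centre (location 7)
    d = d[d > 0]                          # drop the centre itself
    w = np.exp(-alpha * d)
    w /= w.sum()                          # scale so the weights add to one
    # Neighbors at the same distance share the same weight; keep one per distance.
    return {float(dist): float(weight) for dist, weight in zip(np.round(d, 6), w)}


small = sorted(centre_weights(1.0, 2).items())   # distances 1, sqrt(2), 2
share_alpha_1 = 4 * centre_weights(1.0, 3)[1.0]  # four closest neighbors, n = 25
share_alpha_10 = 4 * centre_weights(10.0, 3)[1.0]
```

The two `share_` values give the fraction of the total weight received by the four closest neighbors of location 7 in the 25-location array for the smallest and largest values of $\alpha$ considered.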

From Table 6.2, we see that the differences in the weight schemes for the two arrays are sizeable only for $\alpha = 1$. We know that, as $\alpha$ increases, more weight is assigned to the closest neighbors which, for location 7, are the same in both arrays. However, when $\alpha$ is closer to 0, there is a more even weight distribution over all neighbors. This effect is illustrated in Table 6.2. Note that the closest neighbors of location 7 in the 25-location array receive approximately 36 percent of the total weight when $\alpha = 1$ but approximately 98 percent when $\alpha = 10$. From our work in Chapter II, we know that the limit of the weights as $\alpha \to +\infty$ is .25 for each of the closest neighbors of location 7. We see that this limit is almost reached when $\alpha = 10$.

Since the relationship between the responses at various locations is included in our model explicitly through a first-order time lag, we consider the correlation, $\rho_{ij}(-1)$, as a measure of the relationship between the response at location $i$ and that at location $j$ for the previous time period. In Table 6.3, the $\rho_{ij}(-1)$'s are presented in order of distance from location 7.

From Table 6.3, we see that there is a substantial difference in the correlations between those of the 13-location array and those of the 25-location array, with those of the latter being smaller for each distance. The same type of difference, where it exists, is seen for the weights in Table 6.2. We know that if more locations are added to the array, $v_{ij}(\alpha)$ (for fixed $i$, $j$, and $\alpha$) decreases. This would suggest that in the larger array, the response at location $j$ provides less information about the response at location $i$ for the next time period and, hence, the decrease in $\rho_{ij}(-1)$. This conjecture may also explain our observation that the differences in correlations between the two arrays for a given $\alpha$ are greater in the $(a, b) = (.25, .5)$ case than in the $(.5, .25)$ case. Since more information is drawn from the neighbors as a whole in the $(.25, .5)$ case, one might expect the neighbor correlations not only to increase in this case for most fixed distances (and $\rho_{77}(-1)$ to decrease) but also that the increase would be greater in the 13-location array because this increase in information from the neighbors is drawn from fewer neighbors. It is also seen that these differences in correlations for the two arrays decrease as $\alpha$ increases due to the stabilization of the weights as seen in Table 6.2.

Table 6.3

The Negative First-Order Correlation of Each Location with Location 7

                                Correlation by Distance from Location 7
   a      b       α      n       0       1       √2      2       √5      3

  .5     .25     1.0     13     .602    .345    .278    .210
  .5     .25     1.0     25     .582    .313    .245    .177    .158    .111
  .5     .25     5.0     13     .619    .359    .277    .201
  .5     .25     5.0     25     .615    .350    .249    .160    .143    .0868
  .5     .25    10.0     13     .622    .361    .275    .202
  .5     .25    10.0     25     .619    .355    .246    .158    .141    .0855
  .25    .5      1.0     13     .478    .362    .315    .262
  .25    .5      1.0     25     .434    .310    .265    .213    .197    .155
  .25    .5      5.0     13     .507    .384    .312    .245
  .25    .5      5.0     25     .493    .364    .273    .191    .178    .121
  .25    .5     10.0     13     .511    .387    .308    .243
  .25    .5     10.0     25     .499    .370    .268    .188    .175    .117

We also note that among the neighbors, the rate of decay in correlations as a function of distance (for a fixed $\alpha$) appears to be greater in the $(.5, .25)$ case. Since the contribution of the neighbors as a whole is less important in this case, it seems reasonable that the correlation with close neighbors relative to far neighbors is greater here.

6.1.4 The Yule-Walker #1 Estimates of $(a, b, \alpha)$

The YW#1 estimation procedures presented in the previous chapters were used on the simulated observations from each model. For each model, the YW#1 estimates of $a$, $b$, and $\alpha$, along with the actual asymptotic standard deviation of each estimator (evaluated at $T$), are presented in Tables 6.4, 6.5, and 6.6, respectively. In addition, several different estimates of the asymptotic standard deviations are given. These will be considered in 6.1.5.

From Table 6.4, we note that for each model for the 13-location array, $a_{T1}$ was closer to $a_o$ than its counterpart for the 25-location array even though the standard deviation is smaller for the larger array. The smaller standard deviation might be partially explained by the fact that more values are averaged in the calculation of $a_{T1}$ in the larger array. (Recall that $G$ and $\Gamma(0)$ are also factors that affect the size of this standard deviation.) It was seen in Table 6.3 that $\rho_{ij}(-1)$ was larger for $n = 13$ than for $n = 25$, which may explain $a_{T1}$'s being closer to $a_o$ for the 13-location array.

We note that the actual asymptotic standard deviation of $a_{T1}$ is smaller for the $(a, b) = (.5, .25)$ models than for the $(.25, .5)$ models, which may be explained by the more dominant locational effect in the former case. In no simulation for the 13-location array was $a_{T1}$ within two standard deviations of $a_o$ when $T = 100$. However, for $T = 1000$, $a_{T1}$ is within two standard deviations of $a_o$ for each case of that array. Similar statements are true for the 25-location array where $(a, b) = (.25, .5)$, but for this larger array and the $(.5, .25)$ cases, no $a_{T1}$-value is within two standard deviations of .5.

Table 6.4

YW#1 Estimates of a with Actual and Estimated Asymptotic Standard Deviations of $a_{T1}$

                                                Asymptotic Standard Deviation*         Convergence
   a      b       α      n      T     a_T1     Actual   Usual YW   YW#1'    YW#1      Γ(0)   Γ_r(0)

  .5     .25     1.0     13     100   .425     .239     .231       .250     .250       26      22
  .5     .25     1.0     13    1000   .495     .0755    .0754      .0758    .0758      26      24
  .5     .25     1.0     25     100   .328     .173     **         **       **         26      **
  .5     .25     1.0     25    1000   .483     .0546    .0544      .0551    .0551      26      27
  .5     .25     5.0     13     100   .420     .237     .231       .250     .249       26      23
  .5     .25     5.0     13    1000   .495     .0749    .0748      .0753    .0753      26      24
  .5     .25     5.0     25     100   .330     .171     .164       .187     .184       25      19
  .5     .25     5.0     25    1000   .483     .0541    .0539      .0546    .0545      25      27
  .5     .25    10.0     13     100   .416     .236     .231       .250     .249       26      23
  .5     .25    10.0     13    1000   .495     .0746    .0745      .0750    .0751      26      24
  .5     .25    10.0     25     100   .331     .171     .163       .186     .183       25      20
  .5     .25    10.0     25    1000   .483     .0539    .0538      .0544    .0543      25      27
  .25    .5      1.0     13     100   .190     .264     .251       .268     .268       26      16
  .25    .5      1.0     13    1000   .253     .0835    .0831      .0835    .0835      26      23
  .25    .5      1.0     25     100   .167     .191     .169       .195     .195       26      15
  .25    .5      1.0     25    1000   .240     .0605    .0600      .0607    .0607      26      22
  .25    .5      5.0     13     100   .196     .257     .244       .262     .263       26      16
  .25    .5      5.0     13    1000   .255     .0812    .0807      .0812    .0813      26      23
  .25    .5      5.0     25     100   .165     .186     .166       .191     .189       26      16
  .25    .5      5.0     25    1000   .241     .0589    .0584      .0591    .0591      26      22
  .25    .5     10.0     13     100   .197     .254     .242       .260     .261       26      16
  .25    .5     10.0     13    1000   .256     .0803    .0797      .0804    .0806      26      23
  .25    .5     10.0     25     100   .164     .184     **         **       **         26      **
  .25    .5     10.0     25    1000   .242     .0583    **         **       **         26      **

*Each value was multiplied by 10.
**This value was not calculated since a [...] could not be calculated.

From these observations, it appears that in order to estimate $a$, a large sample size is necessary, particularly for more moderate sizes of arrays (e.g., $n = 25$).
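The two-standard-deviation comparisons quoted for Table 6.4 can be reproduced from the tabulated estimates and the "Actual" standard deviation column. The sketch below is plain Python; the tuple layout and the helper `within_two_sd` are hypothetical conveniences, and the only adjustment made to the data is dividing the tabulated standard deviations by 10, since the table footnote states that each value was multiplied by 10.

```python
# (a_o, b_o, alpha, n, T, a_T1, actual_sd_times_10), transcribed from Table 6.4.
rows = [
    (0.5, 0.25, 1.0, 13, 100, 0.425, 0.239),
    (0.5, 0.25, 1.0, 13, 1000, 0.495, 0.0755),
    (0.5, 0.25, 1.0, 25, 100, 0.328, 0.173),
    (0.5, 0.25, 1.0, 25, 1000, 0.483, 0.0546),
    (0.5, 0.25, 5.0, 13, 100, 0.420, 0.237),
    (0.5, 0.25, 5.0, 13, 1000, 0.495, 0.0749),
    (0.5, 0.25, 5.0, 25, 100, 0.330, 0.171),
    (0.5, 0.25, 5.0, 25, 1000, 0.483, 0.0541),
    (0.5, 0.25, 10.0, 13, 100, 0.416, 0.236),
    (0.5, 0.25, 10.0, 13, 1000, 0.495, 0.0746),
    (0.5, 0.25, 10.0, 25, 100, 0.331, 0.171),
    (0.5, 0.25, 10.0, 25, 1000, 0.483, 0.0539),
    (0.25, 0.5, 1.0, 13, 100, 0.190, 0.264),
    (0.25, 0.5, 1.0, 13, 1000, 0.253, 0.0835),
    (0.25, 0.5, 1.0, 25, 100, 0.167, 0.191),
    (0.25, 0.5, 1.0, 25, 1000, 0.240, 0.0605),
    (0.25, 0.5, 5.0, 13, 100, 0.196, 0.257),
    (0.25, 0.5, 5.0, 13, 1000, 0.255, 0.0812),
    (0.25, 0.5, 5.0, 25, 100, 0.165, 0.186),
    (0.25, 0.5, 5.0, 25, 1000, 0.241, 0.0589),
    (0.25, 0.5, 10.0, 13, 100, 0.197, 0.254),
    (0.25, 0.5, 10.0, 13, 1000, 0.256, 0.0803),
    (0.25, 0.5, 10.0, 25, 100, 0.164, 0.184),
    (0.25, 0.5, 10.0, 25, 1000, 0.242, 0.0583),
]


def within_two_sd(a_o, a_hat, sd_times_10):
    """True if the estimate lies within two actual standard deviations of a_o."""
    return abs(a_hat - a_o) <= 2 * sd_times_10 / 10


# Count, for each (n, T), how many of the estimates fall within two sd of a_o.
summary = {}
for a_o, b_o, alpha, n, T, a_hat, sd10 in rows:
    key = (n, T, a_o)
    summary[key] = summary.get(key, 0) + within_two_sd(a_o, a_hat, sd10)
```

Each key of `summary` covers the three values of $\alpha$, so a count of 3 means every estimate in that group was within two standard deviations and a count of 0 means none was.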

From Table 6.5, we observe a difference between the performance
of b̂_T1 for n = 13 and n = 25. For those models in which (a, b) = (.5, .25),
b̂_T1 was closer to .25 for n = 13 than for n = 25, but for (n, T) = (13, 100),
the distance from .25 increased as α increased. In the (.25, .5) simulations,
b̂_T1 was closer to .5 for n = 25 in most cases and, for (n, T) =
(25, 100), the distance from .5 decreased as α increased. For (a, b, n, T)
= (.5, .25, 25, 100), â_T1 and b̂_T1 were nearly the same.

The change in the standard deviation of b̂_T1 in going from (.5, .25)
to (.25, .5) is not as great as that for â_T1. The slightly smaller
standard deviations for b̂_T1 in the (.25, .5) case can probably be explained
by the dominant neighbor effect, since one might expect that a stronger
neighbor effect would allow one to estimate b with more precision. For
b̂_T1, there is also less difference between standard deviations as one goes
from the small to the large array. In all cases, b̂_T1 was within two
standard deviations of b_0. For (a, b, n) = (.5, .25, 13), b̂_T1 was within one
standard deviation of .25. The same was true for several cases in which
(a, b, n, T) = (.25, .5, 25, 100).

From these observations, it does not appear that the need for
larger sample sizes is as great for estimating b as it is for estimating a.
It seems that in those cases where there is a dominant neighbor effect,
b̂_T1 performs better on a more moderate-sized array (n = 25) than on
a smaller array, with the performance improving somewhat with an increase
in α. However, in those situations where the location effect is dominant,
Table 6.5

YW#1 Estimates of b with Actual and Estimated Asymptotic
Standard Deviations of b̂_T1

                                          Asymptotic Standard Deviation*
 a     b     α      n     T      b̂_T1     Actual   Usual YW   YW#1'    YW#1

.5    .25    1.0    13    100    .280     .676     .808       .877     .791
.5    .25    1.0    13    1000   .233     .214     .220       .222     .222
.5    .25    1.0    25    100    .331     .685     **         **       **
.5    .25    1.0    25    1000   .277     .217     .212       .219     .218
.5    .25    5.0    13    100    .296     .667     .808       .870     .775
.5    .25    5.0    13    1000   .235     .211     .218       .219     .219
.5    .25    5.0    25    100    .334     .682     .903       1.07     .980
.5    .25    5.0    25    1000   .278     .216     .210       .217     .216
.5    .25   10.0    13    100    .302     .663     .810       .871     .772
.5    .25   10.0    13    1000   .236     .210     .216       .218     .218
.5    .25   10.0    25    100    .337     .679     .893       1.06     .965
.5    .25   10.0    25    1000   .278     .215     .209       .216     .215
.25   .5     1.0    13    100    .413     .672     .750       .857     .849
.25   .5     1.0    13    1000   .466     .212     .220       .222     .222
.25   .5     1.0    25    100    .405     .674     .880       1.01     .964
.25   .5     1.0    25    1000   .467     .213     .228       .232     .232
.25   .5     5.0    13    100    .407     .653     .768       .873     .852
.25   .5     5.0    13    1000   .463     .207     .214       .217     .218
.25   .5     5.0    25    100    .444     .668     .839       .948     .896
.25   .5     5.0    25    1000   .471     .211     .227       .231     .229
.25   .5    10.0    13    100    .407     .647     .770       .872     .848
.25   .5    10.0    13    1000   .462     .204     .212       .215     .216
.25   .5    10.0    25    100    .453     .664     **         **       **
.25   .5    10.0    25    1000   .472     .210     **         **       **

*Each value was multiplied by 10.
**This value was not calculated since α̂_T1 could not be calculated.

b̂_T1 appears to perform better for smaller arrays (n = 13), with
performance improving slightly as α decreases. It seems reasonable to expect
b̂_T1's performance to improve on a larger array (i.e., with more neighbors)
if the importance of the neighbors increases.

From Table 6.6, we see that there were three cases (all three for
n = 25) for which α̂_T1 was not obtained. For two of these cases,
(a, b, α) = (.25, .5, 10.0), α̂_T1 was probably too large to be detected
by our algorithm. We note that the existence of a finite root, as
shown in 3.3.1, does not guarantee that it can be found by an algorithm.
Perhaps some modification to our algorithm would allow detection of
larger values of α̂_T1, but it is expected that there still would be some
cases in which α̂_T1 cannot be found. We discuss this problem in more
detail in 6.1.8.

For (a, b) = (.5, .25), α was underestimated, while for (a, b) =
(.25, .5), α was overestimated. Recall that when (a, b) = (.25, .5),
b was underestimated in all cases and when (a, b) = (.5, .25), b was
overestimated in several cases.

We note that the standard deviation of α̂_T1 is less for the
models for which (a, b) = (.25, .5) than for their counterparts in the
(.5, .25) case. Also, the standard deviation increases as α increases
and decreases as n increases. For (a, b) = (.5, .25), α̂_T1 was within
one standard deviation of α_0 for all cases. In addition, the hypothesis
H_0: α = 0 can be rejected (using the asymptotic distribution of α̂_T1)
at the .05 level for both arrays when (T, α) = (1000, 1.0) or (1000, 5.0)
but not for T = 100 or α = 10.0. When (a, b) = (.25, .5), we know that
the neighbor effect as a whole is given more weight than when (a, b) =
Table 6.6

YW#1 Estimates of α with Actual and Estimated Asymptotic
Standard Deviations of α̂_T1

                                          Asymptotic Standard Deviation
 a     b     α      n     T      α̂_T1     Actual   Usual YW   YW#1'     YW#1

.5    .25    1.0    13    100    .601     .713     .647       .700      .699
.5    .25    1.0    13    1000   .944     .226     .239       .241      .241
.5    .25    1.0    25    100    **       .478     **         **        **
.5    .25    1.0    25    1000   .985     .151     .135       .138      .137
.5    .25    5.0    13    100    1.86     3.40     1.08       1.18      1.14
.5    .25    5.0    13    1000   4.29     1.07     .911       .921      .921
.5    .25    5.0    25    100    4.11     3.05     2.12       2.49      2.31
.5    .25    5.0    25    1000   4.77     .965     .781       .800      .799
.5    .25   10.0    13    100    2.27     22.1     1.29       1.40      1.35
.5    .25   10.0    13    1000   7.28     7.00     2.59       2.62      2.62
.5    .25   10.0    25    100    6.74     20.9     5.47       6.43      5.92
.5    .25   10.0    25    1000   8.20     6.60     2.82       2.90      2.89
.25   .5     1.0    13    100    1.71     .385     .683       .767      .771
.25   .5     1.0    13    1000   1.16     .122     .142       .143      .142
.25   .5     1.0    25    100    1.32     .255     .449       .512      .499
.25   .5     1.0    25    1000   1.07     .0806    .0927      .0943     .0942
.25   .5     5.0    13    100    6.33     1.73     3.65       4.20      4.19
.25   .5     5.0    13    1000   5.87     .548     .811       .822      .820
.25   .5     5.0    25    100    6.42     1.55     3.38       3.86      3.67
.25   .5     5.0    25    1000   5.82     .490     .717       .730      .728
.25   .5    10.0    13    100    33.1     11.0     171000.    195000.   193000.
.25   .5    10.0    13    1000   33.1     3.48     44200.     44700.    44900.
.25   .5    10.0    25    100    **       10.4     **         **        **
.25   .5    10.0    25    1000   **       3.29     **         **        **

**This value was not calculated since α̂_T1 could not be calculated.

(.5, .25). For (a, b) = (.25, .5), the hypothesis H_0: α = 0 could be
rejected at the .05 level for all cases. Except when α = 10.0, α̂_T1 was
within two standard deviations of α_0. Note that, for certain sample sizes,
there are some (a, b, α, n) combinations for which α_0 is less than
two standard deviations from 0. For these combinations, it follows that
even if α̂_T1 = α_0, the hypothesis H_0: α = 0 could not be rejected for
these sample sizes.
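
The rejection decisions discussed above can be illustrated with a short sketch of an asymptotic z-test. This is only an illustration under stated assumptions: the test is taken to be two-sided, the function name `wald_test` is hypothetical, and the standard errors plugged in are the YW#1 estimates from Table 6.6 for (a, b, α, n) = (.5, .25, 1.0, 13). The tests actually used are those of Chapter V, which may differ in detail.

```python
import math

def wald_test(estimate, std_error, null_value=0.0, level=0.05):
    """Two-sided asymptotic z-test of H0: parameter == null_value (assumed form)."""
    z = (estimate - null_value) / std_error
    # Two-sided p-value under the standard normal: P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value, p_value < level

# Estimates and YW#1 standard deviations read from Table 6.6.
z_100, p_100, reject_100 = wald_test(0.601, 0.699)     # T = 100
z_1000, p_1000, reject_1000 = wald_test(0.944, 0.241)  # T = 1000

print(f"T = 100:  z = {z_100:.3f}, reject H0: {reject_100}")
print(f"T = 1000: z = {z_1000:.3f}, reject H0: {reject_1000}")
```

With these inputs the hypothesis is rejected for T = 1000 but not for T = 100, which agrees with the pattern described in the text for α = 1.0.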

There is some indication that α̂_T1 may perform better for (a, b,
n, T) = (.5, .25, 25, 100) than for the corresponding cases for n = 13,
but this is not necessarily so for the (a, b) = (.25, .5) cases. This
observation is made despite the fact that the standard deviations of α̂_T1
are smaller for the larger array than their counterparts for the smaller
array. The greatest improvement in the α̂_T1-value in going from T = 100
to T = 1000 is for the (a, b, n) = (.5, .25, 13) cases.

The results of the simulations when α = 10 were somewhat erratic,
particularly in the cases when (a, b) = (.25, .5). This, along with the
large asymptotic standard deviations, would suggest that perhaps in
situations where there is a strong distance effect among the neighbors,
one might do better by fitting a closest-neighbor model (i.e., use known
weights). Recall from Table 6.2 that the weights were quite close to
their limiting values for α = 10.
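
The approach of the weights to a closest-neighbor limit can be sketched numerically. The exact weight function is defined elsewhere in this work and is not reproduced here, so the form below, weights proportional to exp(-α d) and normalized to sum to one, is an assumption made only for illustration. The distances and the function name `exponential_weights` are likewise hypothetical.

```python
import math

def exponential_weights(distances, alpha):
    """Normalized weights proportional to exp(-alpha * d) (assumed form)."""
    d_min = min(distances)
    # Shift by the minimum distance before exponentiating; the shift cancels
    # in the normalization and avoids underflow for large alpha.
    raw = [math.exp(-alpha * (d - d_min)) for d in distances]
    total = sum(raw)
    return [r / total for r in raw]

# Hypothetical distances from one location to its four neighbors.
neighbor_distances = [1.0, 1.5, 2.0, 3.0]

for alpha in (0.0, 1.0, 5.0, 10.0):
    weights = exponential_weights(neighbor_distances, alpha)
    print(alpha, [round(w, 4) for w in weights])
```

Under this assumed form, α = 0 gives equal weights, and as α grows nearly all of the weight moves to the closest neighbor. If several neighbors were tied for the smallest distance, the limiting weight would be shared equally among them.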

From these observations, it appears that there is an inverse
relationship between b̂_T1 and α̂_T1. When α̂_T1 overestimates
(underestimates), b̂_T1 tends to underestimate (overestimate). By considering
the standard deviations of α̂_T1 for our cases, it appears that α̂_T1 has
the potential to perform better if the neighbor effect is dominant.
However, this conjecture was supported in only a few cases of our
one-time (for each (a, b, α, n, T) combination) simulations. If the
locational effect is dominant, it seems that having more locations in
the array improves α̂_T1's performance for larger values of α and smaller
sample sizes. (In order to detect a distance effect, one might expect
that more neighbors would be needed if the overall neighbor effect is
weaker.)

In considering â_T1, b̂_T1, and α̂_T1, we have not found a
situation where all three estimators perform well unless the sample size is
large. If we consider the ratio of the standard deviation of the
estimator to the actual parameter value, the larger ratio in the case of
α̂_T1, relative to the ratios for â_T1 and b̂_T1, indicates that it is more
difficult to obtain information about α than about a or b.

6.1.5 Estimates of Asymptotic Standard Deviations

Three estimates of each standard deviation are given in
Tables 6.4 through 6.6. Recall that the asymptotic variance-covariance
matrix of â_T1, b̂_T1, and α̂_T1 evaluated at time T is

$$T^{-1}\Sigma_1 = T^{-1}\,(M_0^{-1}R_0)\,\Sigma\,(M_0^{-1}R_0)',$$

where M_0 and R_0 are given in (3.3.29) and (3.3.30), respectively,
and Σ = G ⊗ Γ^{-1}(0). The estimates, b̂_T1 and α̂_T1, were used to
estimate M_0 and R_0 in each of the three procedures for estimating Σ.
The differences among the procedures involve the estimation of G and Γ(0).
The usual YW procedure uses G_T and Γ_T(0), where G_T is given by (4.1.6) and
Γ_T(0) is given in Section 2.1. The YW#1' estimator involves G_T1 and Γ_T(0),
where G_T1 is given by (4.2.2), whereas the YW#1 estimator uses G_T1 and
Γ_T1(0). Since (4.2.3) was used to calculate Γ_T1(0), the number of steps
required until convergence is given in Table 6.4. The number of steps
until convergence is also given for the population matrix, since Γ(0) was
calculated using (4.1.4). In both cases, calculations were stopped when
convergence was attained. (Note that in many cases, Γ_T1(0) converged
more quickly than did Γ(0).) The convergence criterion used was to
assume convergence when the absolute change in the values of the
zeroth-order correlation, ρ_ij(0), from one step to the next was less
than a fixed small tolerance (a power of 10) for all pairs of locations.
In every case Γ_T1(0) was nonsingular, so that Γ_T1^{-1}(0) was well
defined. In this section, we consider only the estimates of the standard
deviations, but in the next section we will consider estimates of the
off-diagonal elements of this matrix.

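
The stopping rule just described can be sketched as follows. Equations (4.1.4) and (4.2.3) are not reproduced here, so the sketch assumes the standard first-order autoregressive fixed point Γ(0) = B Γ(0) B' + G and iterates it from Γ = G. The tolerance `tol`, the 3-location coefficient matrix, and all function names are illustrative placeholders rather than the values or routines used in the simulations.

```python
def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(row) for row in zip(*X)]

def correlations(S):
    """Correlation matrix implied by a covariance matrix S."""
    n = len(S)
    return [[S[i][j] / (S[i][i] * S[j][j]) ** 0.5 for j in range(n)]
            for i in range(n)]

def stationary_covariance(B, G, tol=1e-6, max_steps=1000):
    """Iterate Gamma <- B Gamma B' + G until every correlation changes by < tol."""
    n = len(G)
    gamma = [row[:] for row in G]
    Bt = transpose(B)
    for step in range(1, max_steps + 1):
        BGBt = matmul(matmul(B, gamma), Bt)
        new_gamma = [[BGBt[i][j] + G[i][j] for j in range(n)] for i in range(n)]
        old_rho, new_rho = correlations(gamma), correlations(new_gamma)
        change = max(abs(new_rho[i][j] - old_rho[i][j])
                     for i in range(n) for j in range(n))
        gamma = new_gamma
        if change < tol:
            return gamma, step
    raise RuntimeError("no convergence within max_steps")

# Illustrative model: a = .5 on the diagonal, b = .25 split equally
# between the two neighbors, unit error variances.
B = [[0.5, 0.125, 0.125], [0.125, 0.5, 0.125], [0.125, 0.125, 0.5]]
G = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
gamma0, steps = stationary_covariance(B, G)
print(steps, [round(v, 4) for v in gamma0[0]])
```

Because the criterion is applied to correlations rather than covariances, the step count depends on how quickly the relative structure of Γ(0) settles, which is one plausible reason the sample and population matrices can need different numbers of steps.
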
From Table 6.4, we see that there are no sizeable or meaningful
differences among the three estimators in those cases where T = 1000, and
so we restrict our comments to those cases where T = 100. We also
observe that for the simulations when (a, b) was (.5, .25), the usual
YW estimators yielded underestimates of the standard deviations, whereas
the YW#1-type estimators yielded overestimates. (When we refer to both
the YW#1' and YW#1 estimators, we refer to them collectively as the
YW#1-type estimators.) Although there is no great discrepancy among the
estimates, in each case the usual YW estimate was closer to the actual
asymptotic standard deviation than were the YW#1-type estimates. There
was virtually no difference between the YW#1-type estimates. In the
(a, b) = (.25, .5) simulations, the usual YW estimator again always
yielded underestimates, while the YW#1-type estimators yielded
overestimates. The YW#1-type estimates were virtually the same in each
case and both were closer to the actual value than was the usual YW
estimate.

Although there is only a little difference among the three
estimates of the asymptotic standard deviations of â_T1 (even for T = 100),
these observations suggest the usual YW estimator may perform slightly
better when the location effect is dominant and the YW#1-type estimators
slightly better in situations where the neighbor effect is dominant.

From Table 6.5, we see that the only sizeable or meaningful
differences among the three estimators are in the cases where T = 100,
and so we again restrict our comments to those cases.

For the (a, b) = (.5, .25) simulations, all three estimators
overestimated the asymptotic standard deviations of b̂_T1. The YW#1
estimate was closest to the actual value for n = 13 but the usual YW
estimate was closest when n = 25. There is a definite difference
between the YW#1' and YW#1 estimates, with the YW#1 estimate being
closer to the actual value. For the (a, b) = (.25, .5) simulations,
all three estimators overestimated the actual value in each case.
(Recall that b was underestimated in all of these cases.) There were
differences among the estimates, with the usual YW estimate being the
closest to the actual value in all cases. Although the YW#1-type
estimates were more similar to one another than to the usual YW
estimates, there were some differences between them, with the YW#1
estimate being the closer to the actual value.

From these observations, it appears that the usual YW estimator
may perform better if the neighbor effect is dominant, or if the
location effect is dominant and a moderate-sized array (n = 25) is considered.
If the locational effect is dominant and a smaller array is considered,
the YW#1 estimator may perform better.

As for the previous two parameters, it appears, from Table 6.6,
that the only meaningful differences among the three estimators are in
those cases where T = 100, and so our comments are restricted to those
cases. For the (a, b) = (.5, .25) simulations, the standard deviation
is underestimated in all cases. (Note that α was also underestimated
in all cases.) All three estimates are rather poor in most of the
cases that we studied. However, the YW#1-type estimates are closer
to the actual value in each case when (a, b) = (.5, .25). Even though
the YW#1' estimate was greater than the YW#1 estimate in each of these
cases, there were few meaningful differences between these two
estimates. For the (a, b) = (.25, .5) simulations, both α and the
asymptotic standard deviation of α̂_T1 were overestimated in each case.
Although the differences among the three estimates of the standard
deviation in each case do not appear to be particularly meaningful,
the usual YW estimate was the closest to the actual value.

It appears to be difficult to estimate the actual asymptotic
standard deviation of α̂_T1. In situations where the neighbor effect is
dominant, reasonable estimates appear to be attainable only for large T
and small α. In such cases, it does not appear to matter which
estimation scheme is used. In those situations where the locational effect
is dominant, the estimation schemes appear to perform adequately for
large T even for an α-value as large as 5.0. For small T, the procedures
may be adequate for a weaker distance effect (α = 1) on a smaller array
(n = 13) and for a stronger distance effect (α = 5) on a larger array (n = 25).
These two cases are the only ones where some procedures, the YW#1-type,
appear to perform better than another. As in our consideration of α̂_T1,
it would seem reasonable to use a closest-neighbor specification if the
distance effect were quite strong (for example, α = 10).

From these results, we make some general observations concerning
the estimation of all three standard deviations. Although the usual YW
estimates were often less than both YW#1-type estimates, this is not
necessarily a good property. Closeness to the actual value is a more
desirable property. (Also, for finite T, it would appear to be more
desirable to overestimate a standard deviation than to underestimate it,
since this would suggest that the asymptotic univariate tests, given in
Chapter V, would be conservative.) In terms of closeness, the choice of
a standard deviation estimator does not appear to be crucial in the
case of σ(α̂_T1). Also, the asymptotics seem to come into effect more
slowly for the estimators of σ(α̂_T1). Consequently, it seems that the
choice could be made relative to considerations for estimating σ(â_T1) and
σ(b̂_T1). Since the differences in the estimates of σ(â_T1) were not sizeable in
our cases (even for small T), one might choose a standard deviation
estimation scheme (for smaller samples) based only on the considerations
for σ(b̂_T1). (The choice does not appear to be crucial for large sample
sizes such as T = 1000.) Unfortunately, there does not appear to be
any overall best estimator of σ(b̂_T1), and so factors such as neighbor and
location effects and array size need to be considered. (See our earlier
discussion on estimates of σ(b̂_T1).)

6.1.6  Estimates  of  Asymptotic  Covariances 

From the results on multiparameter inference presented in
Chapter V, it is seen that the entire asymptotic variance-covariance
matrix of â_T1, b̂_T1, and α̂_T1 needs to be estimated. In 6.1.5, we
discussed the estimation of T^{-1}Σ_1. The three actual asymptotic
covariance terms are presented in Tables 6.7, 6.8, and 6.9, respectively,
along with the three estimates of each term. The actual correlation
coefficient is also given for each case.

From Table 6.7, we note that the correlation coefficient between
â_T1 and b̂_T1 is negative. This is intuitively appealing since we would
expect a trade-off between the location and neighbor effect parameter
estimates. We would expect that an overestimate (underestimate) of a
would be accompanied by an underestimate (overestimate) of b. A
pairwise comparison of correlations, in terms of magnitude, reveals them to
be larger for the smaller array and for the neighbor-dominant models.
Since there is general agreement among the three covariance estimates
for each case when T = 1000, we restrict our comments to the smaller
sample simulations. In situations of a dominant locational effect
((a, b) = (.5, .25)), the YW#1' estimator appears to perform better for
the smaller array while the usual YW estimator seems to perform better
for the larger array. For those cases in which the neighbor effect was
dominant ((a, b) = (.25, .5)), a YW#1-type estimator did better than the
usual YW estimator. The difference in performance is most clear for the
larger array.

From Table 6.8, we see that there is a small positive asymptotic
correlation between â_T1 and α̂_T1. Since these covariance estimates
involve α̂_T1, the poor performance of these estimators for large α is
understandable. (Recall that the estimates of α were poor for large α
in some cases.) Although there are some reasonable estimates for small
samples and small or moderately large α, it seems that these estimation
Table 6.7

Actual and Estimated Values of the Asymptotic Covariance
of â_T1 and b̂_T1

                                 Asymptotic Covariance*
 a     b     α      n     T      Actual   (Correlation)   Usual YW   YW#1'    YW#1

.5    .25    1.0    13    100    -.255    (-.158)         -.105      -.241    -.284
.5    .25    1.0    13    1000   -.0255   (-.158)         -.0231     -.0234   -.0233
.5    .25    1.0    25    100    -.135    (-.114)         **         **       **
.5    .25    1.0    25    1000   -.0135   (-.114)         -.0151     -.0156   -.0153
.5    .25    5.0    13    100    -.255    (-.162)         -.112      -.256    -.305
.5    .25    5.0    13    1000   -.0255   (-.162)         -.0233     -.0237   -.0237
.5    .25    5.0    25    100    -.130    (-.112)         -.0995     -.213    -.195
.5    .25    5.0    25    1000   -.0130   (-.112)         -.0147     -.0152   -.0148
.5    .25   10.0    13    100    -.255    (-.163)         -.114      -.263    -.313
.5    .25   10.0    13    1000   -.0255   (-.163)         -.0233     -.0238   -.0238
.5    .25   10.0    25    100    -.130    (-.112)         -.0999     -.212    -.193
.5    .25   10.0    25    1000   -.0130   (-.112)         -.0147     -.0151   -.0147
.25   .50    1.0    13    100    -.391    (-.221)         -.268      -.275    -.264
.25   .50    1.0    13    1000   -.0391   (-.221)         -.0360     -.0362   -.0359
.25   .50    1.0    25    100    -.210    (-.163)         -.0862     -.172    -.164
.25   .50    1.0    25    1000   -.0210   (-.163)         -.0189     -.0195   -.0192
.25   .50    5.0    13    100    -.369    (-.220)         -.210      -.237    -.243
.25   .50    5.0    13    1000   -.0369   (-.220)         -.0339     -.0341   -.0337
.25   .50    5.0    25    100    -.191    (-.154)         -.0882     -.175    -.165
.25   .50    5.0    25    1000   -.0191   (-.154)         -.0168     -.0176   -.0176
.25   .50   10.0    13    100    -.358    (-.218)         -.197      -.230    -.239
.25   .50   10.0    13    1000   -.0358   (-.218)         -.0328     -.0332   -.0329
.25   .50   10.0    25    100    -.186    (-.152)         **         **       **
.25   .50   10.0    25    1000   -.0186   (-.152)         **         **       **

*Each covariance value was multiplied by 1000; correlations were not.
**This value was not calculated since α̂_T1 could not be calculated.

Table 6.8

Actual and Estimated Values of the Asymptotic Covariance
of â_T1 and α̂_T1

                                 Asymptotic Covariance*
 a     b     α      n     T      Actual   (Correlation)   Usual YW   YW#1'     YW#1

.5    .25    1.0    13    100    .590     (.0346)         .205       .374      .570
.5    .25    1.0    13    1000   .0590    (.0346)         .0659      .0590     .0541
.5    .25    1.0    25    100    .197     (.0238)         **         **        **
.5    .25    1.0    25    1000   .0197    (.0238)         .0264      .0250     .0219
.5    .25    5.0    13    100    5.49     (.0682)         -.583      .691      1.43
.5    .25    5.0    13    1000   .549     (.0682)         .397       .402      .404
.5    .25    5.0    25    100    2.60     (.0498)         1.47       3.24      2.91
.5    .25    5.0    25    1000   .260     (.0498)         .274       .274      .260
.5    .25   10.0    13    100    42.0     (.0805)         -.700      1.01      1.91
.5    .25   10.0    13    1000   4.20     (.0805)         1.34       1.37      1.38
.5    .25   10.0    25    100    20.4     (.0573)         4.07       9.33      8.54
.5    .25   10.0    25    1000   2.04     (.0573)         1.11       1.12      1.07
.25   .5     1.0    13    100    .663     (.0652)         1.01       1.06      .896
.25   .5     1.0    13    1000   .0663    (.0652)         .0734      .0727     .0708
.25   .5     1.0    25    100    .238     (.0487)         .183       .382      .362
.25   .5     1.0    25    1000   .0238    (.0487)         .0256      .0262     .0249
.25   .5     5.0    13    100    5.55     (.125)          7.43       8.71      8.34
.25   .5     5.0    13    1000   .555     (.125)          .756       .767      .762
.25   .5     5.0    25    100    2.84     (.0984)         2.41       5.91      5.86
.25   .5     5.0    25    1000   .284     (.0984)         .375       .386      .384
.25   .5    10.0    13    100    41.3     (.148)          351000.    439000.   384000.
.25   .5    10.0    13    1000   4.13     (.148)          38700.     37600.    36700.
.25   .5    10.0    25    100    21.7     (.113)          **         **        **
.25   .5    10.0    25    1000   2.17     (.113)          **         **        **

*Each covariance value was multiplied by 1000; correlations were not.
**This value was not calculated since α̂_T1 could not be calculated.

procedures would perform best for large samples and small α. In such
a case, either YW#1-type estimator would probably be adequate.

As we would expect from our observations in 6.1.4, there is
a strong negative correlation between b̂_T1 and α̂_T1, as can be seen in
Table 6.9. The magnitude of this correlation increases with the
addition of locations to the array, an increase in α, or a decrease in the
neighbor parameter coupled with an increase in the location parameter.
The three estimators appear to perform poorly in the presence of a strong
distance effect or in the presence of a moderately strong distance effect
(α = 5) coupled with a dominant neighbor effect. Although there were
some isolated cases where the estimators perform satisfactorily for small
samples, it would seem, generally, that a large sample size is necessary
to get reasonable estimates of the asymptotic covariance of b̂_T1 and α̂_T1.
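
As a consistency check on how the tabled quantities relate, the reported correlation can be recovered from the tabled covariance and standard deviations once the scale factors in the table footnotes are undone. The sketch below does this for the "Actual" entries at (a, b, α, n, T) = (.5, .25, 1.0, 13, 100); the function name `correlation` is hypothetical, and small discrepancies from the tabled value are expected because the inputs are rounded to three digits.

```python
def correlation(cov, sd_x, sd_y):
    """Correlation implied by a covariance and two standard deviations."""
    return cov / (sd_x * sd_y)

# "Actual" entries for (a, b, alpha, n, T) = (.5, .25, 1.0, 13, 100).
cov_b_alpha = -0.239 / 10.0  # Table 6.9 entries are multiplied by 10
sd_b = 0.676 / 10.0          # Table 6.5 entries are multiplied by 10
sd_alpha = 0.713             # Table 6.6 entries are unscaled

rho = correlation(cov_b_alpha, sd_b, sd_alpha)
print(round(rho, 3))  # compare with the tabled correlation of -.495
```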

The YW#1 estimators of a and b (and the YW#2 estimator of a)
would be the same if the value of α (and hence, the weights) were
regarded as being known. Consequently, the above results and
discussion concerning these estimates and their respective actual asymptotic
standard deviations and covariances in the variable weights case are
also applicable to the corresponding known weights cases.

6.1.7  Mean  Squared  Error 

In order to compare the YW#1 method of estimating B with the
usual YW estimation procedure, the mean squared error was calculated for
each simulation for which estimates of a, b, and α were obtained. The
mean squared error (MSE) is found in the following way:

$$\mathrm{MSE} = \frac{\sum_{t=1}^{T}\sum_{i=1}^{n}\left(y_{t,i} - \hat{y}_{t,i}\right)^{2}}{n \cdot T - k},$$
Table 6.9

Actual and Estimated Values of the Asymptotic Covariance
of b̂_T1 and α̂_T1

                                 Asymptotic Covariance*
 a     b     α      n     T      Actual    (Correlation)   Usual YW   YW#1'     YW#1

.5    .25    1.0    13    100    -.239     (-.495)         -.209      -.241     -.225
.5    .25    1.0    13    1000   -.0239    (-.495)         -.0258     -.0262    -.0264
.5    .25    1.0    25    100    -.199     (-.607)         **         **        **
.5    .25    1.0    25    1000   -.0199    (-.607)         -.0171     -.0183    -.0181
.5    .25    5.0    13    100    -1.44     (-.635)         -.524      -.615     -.504
.5    .25    5.0    13    1000   -.144     (-.635)         -.126      -.128     -.128
.5    .25    5.0    25    100    -1.51     (-.728)         -1.52      -2.12     -1.76
.5    .25    5.0    25    1000   -.151     (-.728)         -.119      -.127     -.126
.5    .25   10.0    13    100    -10.4     (-.706)         -.646      -.756     -.608
.5    .25   10.0    13    1000   -1.04     (-.706)         -.394      -.400     -.399
.5    .25   10.0    25    100    -11.1     (-.783)         -4.04      -5.65     -4.65
.5    .25   10.0    25    1000   -1.11     (-.783)         -.456      -.488     -.484
.25   .5     1.0    13    100    -.116     (-.447)         -.267      -.355     -.359
.25   .5     1.0    13    1000   -.0116    (-.447)         -.0150     -.0153    -.0152
.25   .5     1.0    25    100    -.0956    (-.556)         -.269      -.347     -.320
.25   .5     1.0    25    1000   -.00956   (-.556)         -.0126     -.0132    -.0132
.25   .5     5.0    13    100    -.673     (-.595)         -1.85      -2.41     -2.36
.25   .5     5.0    13    1000   -.0673    (-.595)         -.110      -.113     -.113
.25   .5     5.0    25    100    -.720     (-.696)         -2.28      -2.92     -2.58
.25   .5     5.0    25    1000   -.0720    (-.696)         -.121      -.125     -.123
.25   .5    10.0    13    100    -4.78     (-.672)         -36100.    -46000.   -45800.
.25   .5    10.0    13    1000   -.478     (-.672)         -2450.     -2510.    -2510.
.25   .5    10.0    25    100    -5.24     (-.759)         **         **        **
.25   .5    10.0    25    1000   -.524     (-.759)         **         **        **

*Each covariance value was multiplied by 10; correlations were not.
**This value was not calculated since α̂_T1 could not be calculated.

where ŷ_t = B_T y_{t-1} for the usual YW estimator of B (= B_T), ŷ_t = B_T1 y_{t-1}
for the YW#1 estimator of B (= B_T1), and k is the actual number of estimates
calculated by our procedures in estimating B. Thus, k = n² for the usual
YW estimator and k = 3 for the YW#1 estimator. The term (y_{t,i} - ŷ_{t,i}) is
really an estimate of ε_{t,i}. Since all error terms were assigned the same
variance in our simulations, the MSE is just an estimate of unity, the
variance of the error terms in our models. The MSE's and YW#1 estimates
of a, b, and α for each model are given in Table 6.10.
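
The MSE calculation described above can be sketched in a few lines. The sketch assumes that an initial observation y_0 is available, so that one-step predictions exist for t = 1, ..., T; the function name `mean_squared_error` and the small numerical example are hypothetical and are not taken from the simulations.

```python
def mean_squared_error(y, B_hat, k):
    """One-step-ahead MSE with divisor n*T - k.

    y     : list of T + 1 observation vectors y_0, ..., y_T (each of length n)
    B_hat : n x n estimated coefficient matrix (list of rows)
    k     : number of estimated parameters (n**2 for usual YW, 3 for YW#1)
    """
    n = len(y[0])
    T = len(y) - 1
    sse = 0.0
    for t in range(1, T + 1):
        for i in range(n):
            # Predicted response at location i from the previous vector.
            predicted = sum(B_hat[i][j] * y[t - 1][j] for j in range(n))
            sse += (y[t][i] - predicted) ** 2
    return sse / (n * T - k)

# Tiny hypothetical example: n = 2 locations, T = 2 time periods, k = 3.
y = [[1.0, 0.0], [0.5, 0.25], [0.4, 0.2]]
B_hat = [[0.5, 0.0], [0.25, 0.5]]
mse = mean_squared_error(y, B_hat, k=3)
print(mse)
```

The divisor makes the comparison between the two procedures fair: the usual YW fit spends n² parameters, so for n = 25 and T = 100 it gives up 625 of the 2500 available degrees of freedom, whereas YW#1 gives up only 3.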

Our criterion for evaluating the MSE's for each simulation is to
compare each calculated MSE with unity. From Table 6.10, we observe that
in 15 of the 21 comparisons, the YW#1 MSE was at least as close to unity
as the usual YW MSE was. For those cases where T = 100, YW#1 did as well
as or better than the usual YW procedure 7 times out of 10. Even in each
case where the usual YW MSE is closer to unity, the difference between it
and the YW#1 MSE, in terms of closeness, is not great. For (a, b, n, T)
= (.5, .25, 13, 100), there is a sizeable difference, for all 3 α-values,
between the usual YW MSE and that of YW#1, with the YW#1 MSE being closer
to unity. From these observations, it appears that if the model is of the
form in (6.1.1), the YW#1 procedure performs at least as well as the
standard YW procedure.

Table 6.10

Mean Squared Errors for Usual Yule-Walker
and Yule-Walker #1 Estimates

                                     YW#1 Estimates                       MSE
  a     b      α      n     T     $a_{T1}$   $b_{T1}$   $\alpha_{T1}$    Usual YW    YW#1

 .5    .25    1.0    13    100     .425       .280       .601             .974       .998
 .5    .25    1.0    13   1000     .495       .233       .944             .988       .987
 .5    .25    1.0    25    100     .328       .331       **               1.00       **
 .5    .25    1.0    25   1000     .483       .277       .985             .981       .984
 .5    .25    5.0    13    100     .420       .296       1.86             .973       .998
 .5    .25    5.0    13   1000     .495       .235       4.29             .988       .987
 .5    .25    5.0    25    100     .330       .334       4.11             1.00       1.01
 .5    .25    5.0    25   1000     .483       .278       4.77             .981       .984
 .5    .25   10.0    13    100     .416       .302       2.27             .972       .999
 .5    .25   10.0    13   1000     .495       .236       7.28             .988       .987
 .5    .25   10.0    25    100     .331       .337       6.74             1.00       1.00
 .5    .25   10.0    25   1000     .483       .278       8.20             .981       .984
 .25   .5     1.0    13    100     .190       .413       1.71             .997       1.02
 .25   .5     1.0    13   1000     .253       .466       1.16             1.01       1.01
 .25   .5     1.0    25    100     .167       .405       1.32             1.06       1.03
 .25   .5     1.0    25   1000     .240       .467       1.07             .989       .990
 .25   .5     5.0    13    100     .196       .407       6.33             1.00       1.03
 .25   .5     5.0    13   1000     .255       .463       5.87             1.01       1.01
 .25   .5     5.0    25    100     .165       .444       6.42             1.05       1.03
 .25   .5     5.0    25   1000     .241       .471       5.82             .990       .990
 .25   .5    10.0    13    100     .197       .407       33.1             1.00       1.03
 .25   .5    10.0    13   1000     .256       .462       33.1             1.01       1.01
 .25   .5    10.0    25    100     .164       .453       **               **         **
 .25   .5    10.0    25   1000     .242       .472       **               **         **

**This value was not calculated since $\alpha_{T1}$ could not be calculated.

6.1.8 Multiple Simulations with One Model

In order to gain information on the actual distribution of our
estimators for finite T, multiple simulations were performed for the same
model, (a, b, α, n) = (.5, .25, 1.0, 13). Simulations were performed
until we had obtained one thousand values of ($a_{T1}$, $b_{T1}$, $\alpha_{T1}$), each
based on T = 100 vector observations. Sample moments were calculated using
these 1000 estimates. The results, along with the corresponding moments
of the asymptotic distribution evaluated at T = 100, are presented in
Tables 6.11 and 6.12.

Table 6.11
Mean, Range, and Standard Deviation of ($a_{T1}$, $b_{T1}$, $\alpha_{T1}$)

                                               Range           Standard Deviation
Parameter   Actual Value   Sample Mean     Min       Max       Sample    Asymptotic

a               .5            .424         .328      .493      .0243      .0239
b               .25           .265        -.0161     .493      .0842      .0676
α               1.0           1.67        -37.8      33.6      4.31

Table 6.12
Covariances and Correlations of ($a_{T1}$, $b_{T1}$, $\alpha_{T1}$)

                                      Sample                     Asymptotic
Parameter                    Covariance   Correlation    Covariance   Correlation

Cov($a_{T1}$, $b_{T1}$)        -.000226      -.110         -.000255      -.158
Cov($a_{T1}$, $\alpha_{T1}$)   -.000911      -.00869        .000590       .0346
Cov($b_{T1}$, $\alpha_{T1}$)   -.139         -.382         -.0239        -.495


By observing the range of values of $a_{T1}$ in Table 6.11, all of
which fall below the actual value of .5, it appears that $a_{T1}$ is a
biased estimator of a for this model and sample size. However, the sample
standard deviation is close to the asymptotic value. The histogram,
constructed using the 1000 $a_{T1}$'s, was definitely bell-shaped. (Recall
that the errors in our simulations were normally distributed.)

There does not appear to be a problem of biasedness with $b_{T1}$.
Both the sample mean and midpoint of the range are close to .25.
A plot of the $b_{T1}$'s again shows no obvious departures from normality.
The sample standard deviation is not as close to the asymptotic standard
deviation as it was in the case for $a_{T1}$.

The behavior of $\alpha_{T1}$ appears to be unstable, as evidenced by the
very large range and the fact that the sample standard deviation is much
larger than the asymptotic value. The sample mean is rather close to 1.0
considering the instability of the estimator. Because of the wide range,
the plot produced by our histogram algorithm was not very informative.

In the course of generating the 1000 simulations, there were 12
times that our algorithm for calculating $\alpha_{T1}$ was not able to produce an
estimate. (That is, actually 1012 simulations were generated, of which the
results for 1000 were used.) The probable reason for termination in most
of these cases is that the magnitude of $\alpha_{T1}$ was close to or beyond the
limits of the computer. For these 12 simulations, the average values of
$a_{T1}$ and $b_{T1}$ were .422 and .0820, respectively. With this small average
$b_{T1}$-value, it is not surprising that there were difficulties in
calculating $\alpha_{T1}$, since if $b_{T1}$ = 0, α cannot be calculated. (See the
discussion in Chapter II following the introduction of A10 and A11.)

Even with these 12 simulations eliminated, there were some large
absolute values of $\alpha_{T1}$ included among the 1000 values. With these values
(along with 3 other questionable values) removed from the sample, we have
a sample average and standard deviation (from 981 $\alpha_{T1}$'s) of 1.209 and
1.431, respectively, which certainly agree more with the asymptotic values.

The estimate of the covariance of $a_{T1}$ and $b_{T1}$ is relatively close
to the asymptotic value but the same cannot be said for the other two
estimates, probably due to the fact that they involve $\alpha_{T1}$.
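
The entries of Tables 6.11 and 6.12 can be cross-checked with the identity correlation = covariance / (product of standard deviations). The sketch below is illustrative only; the variable and function names are hypothetical, and the inputs are the rounded values printed in the tables, so agreement is expected only to about three figures. The last step backs out the asymptotic standard deviation of the α estimator, which is not shown in Table 6.11 as reproduced here; that figure is an inference from the asymptotic covariance and correlation, not a value reported in the text.

```python
# Sample standard deviations (Table 6.11) and sample covariances (Table 6.12).
sample_sd = {"a": 0.0243, "b": 0.0842, "alpha": 4.31}
sample_cov = {
    ("a", "b"): -0.000226,
    ("a", "alpha"): -0.000911,
    ("b", "alpha"): -0.139,
}


def correlation(cov_xy, sd_x, sd_y):
    """Correlation implied by a covariance and two standard deviations."""
    return cov_xy / (sd_x * sd_y)


sample_corr = {
    (x, y): correlation(c, sample_sd[x], sample_sd[y])
    for (x, y), c in sample_cov.items()
}

# Inference, not a reported value: solve cov = corr * sd_a * sd_alpha
# for sd_alpha using the asymptotic entries for the (a, alpha) pair.
asym_sd_a = 0.0239
implied_asym_sd_alpha = 0.000590 / (0.0346 * asym_sd_a)

for pair, r in sample_corr.items():
    print(pair, round(r, 4))
print("implied asymptotic sd of the alpha estimator:",
      round(implied_asym_sd_alpha, 3))
```

The implied asymptotic value is far below the sample standard deviation of 4.31, which is consistent with the instability of the α estimator noted above.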

The main conclusion to be drawn from these repeated simulations is
that $\alpha_{T1}$ behaves in a rather unstable manner for this sample size and
model. Since this was the only model for which repeated simulations were
generated, more information on $\alpha_{T1}$'s behavior could probably be obtained
from  similar  exercises  with  models  with  larger  values  of  a  and/or  a 
dominant  neighbor  effect. 


6.2  A  Real  Data  Example 

6.2.1  Introduction 

For  application  of  some  of  the  procedures  developed  in  earlier 
chapters,  we  chose  to  analyze  monthly  unemployment  rates,  from  July  1960 
to  December  1969,  for  10  employment  exchange  areas  in  the  southwestern 
part  of  England.   The  data  and  a  more  complete  description  of  the  exchange 
areas,  along  with  other  background  information,  can  be  found  in  Cliff, 
Haggett  et  al.  (1975:107-113,  239-248).   Although  more  data  were  available, 
we  chose  10  locations  in  a  somewhat  geographically  uniform  area.   The 
names  and  geographic  coordinates  of  the  cities  associated  with  each  of  the 
ten  exchange  areas  are  given  in  Table  6.13. 

6.2.2  Analysis  and  Results 

Examination  of  the  data  indicated  that  it  should  be  adjusted  to 
account  for  a  yearly  trend  in  rates,  which  was  accomplished  by  taking 
twelfth  differences.   Then  the  series  to  be  analyzed  was  formed  by  removing 
the sample mean. Thus, we have

where $x_{i,t} = z_{i,t+12} - z_{i,t}$ and the $z_{i,t}$'s are the unadjusted monthly rates.
The observations to be analyzed with our procedures are the $u_t$'s,
t = 1, 2, ..., 101. August 1961 and December 1969 correspond to t = 1 and
t = 101, respectively. (It is necessary to consider $u_0$ in order to
calculate the MSE's.)
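
The adjustment described above (twelfth differences followed by removal of the sample mean) can be sketched for a single location as follows. This is a minimal illustration on synthetic data, not the original program; the function names and the synthetic series are hypothetical, and the real unemployment rates are not reproduced here.

```python
def twelfth_difference(z):
    """Return x_t = z_{t+12} - z_t for one location's monthly series z."""
    return [z[t + 12] - z[t] for t in range(len(z) - 12)]


def remove_mean(x):
    """Subtract the sample mean so the adjusted series is centered at zero."""
    m = sum(x) / len(x)
    return [v - m for v in x]


# July 1960 through December 1969 is 114 months, so 102 twelfth
# differences remain (July 1961 through December 1969).
# Synthetic series: a repeating 12-month pattern plus a linear trend.
z = [float(m % 12) + 0.1 * m for m in range(114)]
x = twelfth_difference(z)
u = remove_mean(x)
print(len(x), round(sum(u), 10))
```

Differencing at lag 12 cancels any pattern that repeats every twelve months, and a linear trend becomes a constant that the mean removal then takes out.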

In fitting the data, we assumed the spatial first-order
autoregressive model for $y_t$. That is,

$$y_t = B_r y_{t-1} + \varepsilon_t,$$

where $B_r$ and $\varepsilon_t$ satisfy A6 and A7.

The YW#1 estimates of a, b, and α, along with the YW#1 estimates
of their respective asymptotic variances and covariances, are given in
Table 6.14.

Table 6.13
Names and Coordinates of Employment Exchange Cities

                        Coordinates (in Degrees)
City                    North      West

Barnstaple              51.10      4.08
Bodmin                  50.48      4.75
Dartmouth               50.55      3.47
Dorchester              50.75      2.57
Exeter                  50.75      3.55
Honiton                 50.82      3.17
Launceston              50.63      4.43
Plymouth                50.42      4.23
Salisbury               50.58      1.85
Torquay (Torbay)        50.50      3.43

Source: Espenshade (1970:192-315).

Table 6.14

YW#1 Estimates of (a, b, α) and the Asymptotic
Covariance Matrix of ($a_{T1}$, $b_{T1}$, $\alpha_{T1}$)

Parameter                                        Estimate

a                                                .560
b                                                .344
α                                                .694
Standard deviation of $a_{T1}$                   .0255
Standard deviation of $b_{T1}$                   .0619
Standard deviation of $\alpha_{T1}$              1.19
Cov($a_{T1}$, $b_{T1}$) (Correlation)            -.000756  (-.478)
Cov($a_{T1}$, $\alpha_{T1}$) (Correlation)        .000986  (.0325)
Cov($b_{T1}$, $\alpha_{T1}$) (Correlation)       -.0200    (-.272)

In calculating the YW#1 estimate of the asymptotic covariance
matrix of ($a_{T1}$, $b_{T1}$, $\alpha_{T1}$), 1^(0) converged in 70 steps (using the same
criterion as in the Monte Carlo studies) but we forced the summation to
go 101 steps. However, a comparison of the estimated standard deviations
of $a_{T1}$, $b_{T1}$, and $\alpha_{T1}$ obtained by stopping after 70 steps with those
obtained after 101 steps revealed no appreciable difference.

Using the univariate procedures presented in 5.1.2, we construct
approximate 95% confidence intervals for a, b, and α, which are (.509,
.611), (.220, .468) and (-1.69, 3.07), respectively. On the basis of
these confidence intervals, we can conclude that there are positive
location and neighbor effects. However, since the interval for α contains
zero, if there is a distance effect among the neighbors, we were not able
to detect it. In case there is not very strong evidence of a distance
effect (at least, according to our weight structure), a comparison of
MSE's may indicate that one should regard the weights as known and equal
for all neighbors (i.e., α = 0). Using ($a_{T1}$, $b_{T1}$, $\alpha_{T1}$) the MSE, in
this example, is 14.62, whereas if ($a_{T1}$, $b_{T1}$) were used (and α were
taken to be 0), the MSE is 14.73. It is difficult to evaluate an
individual MSE-value in a real data case because we do not know the
variance of the error terms as we did in the Monte Carlo studies, but the
model with α = 0 seems to fit nearly as well as the other.
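
The intervals quoted above can be reproduced approximately from the entries of Table 6.14. The sketch below assumes each interval is the estimate plus or minus 2 estimated standard deviations. The procedure of 5.1.2 is not reproduced here, so the multiplier of 2 (rather than 1.96) is an assumption, chosen because it matches the quoted endpoints to three figures; the function name `approx_ci` is hypothetical.

```python
# (estimate, estimated standard deviation) from Table 6.14.
table_6_14 = {
    "a": (0.560, 0.0255),
    "b": (0.344, 0.0619),
    "alpha": (0.694, 1.19),
}


def approx_ci(estimate, sd, multiplier=2.0):
    """Approximate interval: estimate +/- multiplier * sd (multiplier assumed)."""
    half_width = multiplier * sd
    return (estimate - half_width, estimate + half_width)


intervals = {name: approx_ci(est, sd) for name, (est, sd) in table_6_14.items()}

for name, (lower, upper) in intervals.items():
    contains_zero = lower <= 0.0 <= upper
    print(name, round(lower, 3), round(upper, 3), "contains 0:", contains_zero)
```

Only the interval for α contains zero, which is the basis for the remark that a distance effect could not be detected.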

In order to determine which effect, location or neighbor, is
dominant, we can construct a confidence interval for (a - b) using
standard results. We have ($a_{T1} - b_{T1}$) = .216 and an estimate of the
standard deviation of ($a_{T1} - b_{T1}$) to be .0774. An approximate 95%
confidence interval for (a - b) is (.0612, .371), which allows us to
conclude that the location effect is dominant.
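
The figures for (a - b) follow from the standard variance identity Var(X - Y) = Var(X) + Var(Y) - 2 Cov(X, Y) applied to the entries of Table 6.14. The sketch below is a worked check under the assumption that the interval uses a multiplier of 2 standard deviations, which matches the quoted endpoints; the variable names are illustrative.

```python
import math

# Entries from Table 6.14.
a_hat, b_hat = 0.560, 0.344
sd_a, sd_b = 0.0255, 0.0619
cov_ab = -0.000756

# Var(a - b) = Var(a) + Var(b) - 2 Cov(a, b)
diff = a_hat - b_hat
sd_diff = math.sqrt(sd_a ** 2 + sd_b ** 2 - 2.0 * cov_ab)

# Assumed multiplier of 2 for an approximate 95% interval.
lower = diff - 2.0 * sd_diff
upper = diff + 2.0 * sd_diff

print(round(diff, 3), round(sd_diff, 4), round(lower, 4), round(upper, 3))
```

Because the covariance is negative, it adds to the variance of the difference; the interval nevertheless lies entirely above zero.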